CANADIAN JOURNAL OF MATHEMATICS


Journal Canadien de Mathématiques 


VOL. V- NO. 3 
1953 


Function spaces Israel Halperin 
Summability defined by Riemann sums J. D. Hill 
Einstein's theory of gravitation K. W. Lamson 
Ultraspherical and Jacobi polynomial sets Fred Brafman 
A theorem of Glaisher Leonard Carlitz 
Weighted quadratic partitions Leonard Carlitz 
Autometrization and the symmetric difference J. G. Elliott 
Some matrix theorems J. K. Goldhaber and George Whaples 
The characters of the symmetric group Masaru Osima 
𝔓-adic integral representations Jean-Marie Maranda 
Modular representations Hirosi Nagao 
Unitary groups generated by reflections G. C. Shephard 
An extreme duodenary form H. S. M. Coxeter and J. A. Todd 
Numerical integration G. W. Tyler 
Balanced incomplete block designs S. S. Shrikhande 
The existence of difference sets T. G. Ostrom 
On residue difference sets Emma Lehmer 


Published for 
THE CANADIAN MATHEMATICAL CONGRESS 
by the 


University of Toronto Press 





EDITORIAL BOARD 


H. S. M. Coxeter, A. Gauthier, R. D. James, R. L. Jeffery, 
G. de B. Robinson, H. Zassenhaus 


with the co-operation of 


A. S. Besicovitch, R. Brauer, D. B. DeLury, P. A. M. Dirac, 
R. Godement, I. Halperin, L. Infeld, S. MacLane, G. Pall, 
L. Schwartz, J. L. Synge, W. J. Webber 


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, H. S. M. Coxeter, University of Toronto. Everything 
possible should be done to lighten the task of the reader and the notation 
and reference system should be carefully thought out. Every paper 
should contain an introduction summarizing the results as far as possible 
in such a way as to be understood by the non-expert. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers is 
$6.00. This is reduced to $3.00 for individual members of the 
following Societies: 


Canadian Mathematical Congress 
American Mathematical Society 
Mathematical Association of America 
London Mathematical Society 

Société Mathématique de France 


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 


University of British Columbia 
Carleton College Ecole Polytechnique 
Université Laval Loyola College 
University of Manitoba McGill University 
McMaster University Université de Montréal 
Queen’s University Royal Military College 
St. Mary’s University University of Toronto 

National Research Council of Canada 
and the 
American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 








FUNCTION SPACES 
ISRAEL HALPERIN 


1. Introduction. This paper is the first in a series dealing with Banach
spaces $L$ whose elements are functions on a measure space $S$. If $W$ is a family
of non-negative weight functions $w_\alpha$, we sometimes write $L_W^p$ when the norm
is given as

\[
\|f\| =
\begin{cases}
\sup_\alpha \left( \int |f(P)|^p\, w_\alpha(P)\, d\gamma(P) \right)^{1/p}, & 1 \le p < \infty, \\
\sup_\alpha \left( w_\alpha\text{-sup}\, |f(P)| \right), & p = \infty
\end{cases}
\]

(here $w_\alpha$-sup means: supremum, neglecting sets on which $w_\alpha(P) = 0$ for almost
all $P$). We sometimes write $L_{(w)}^p$ when $W$ consists of all the functions
equimeasurable with a single $w(P)$ ($w(P)$ is required to satisfy a weak condition,
see §2). When $w(P)$ is identically 1, $L_{(w)}^p$ reduces to classical $L^p$ space.

In §§3 and 4, a Hölder type inequality, of some interest in itself, is proved
(in more general form than actually required elsewhere in this paper). In
§§5 and 6, using this inequality, we determine explicitly the conjugate spaces
$L^*$, $L^{**}$ when $L$ is of type $L_{(w)}^p$. It turns out that the case in which $S$ has
infinite measure but $w$ has finite integral on $S$ is pathological. Excluding this case
we show: (i) if $1 \le p < \infty$ then $L^*$ is a new generalization of classical $L^q$ space;
(ii) if $1 < p < \infty$ then $L_{(w)}^p$ is reflexive. In the pathological case, $L_{(w)}^p$ fails
to be reflexive for every $p$.

The main result of the present paper solves for a special case a very deep 
problem indicated in [1, p. 182]: if a linear vector space carries a family of 
norms and a new norm is defined as the supremum of the given norms, what is 
the nature of the conjugate space to the new space? An extension to vector 
valued functions will be given by H. W. Ellis and the author in [2]. 

Since references to the present paper occur in the literature, it is remarked
that the results of this paper were found by the author and embodied in a
manuscript in the summer of 1950 at the Research Institute of the Canadian
Mathematical Congress. Function spaces which could be considered as special
cases of the $L_{(w)}^p$ had been defined previously by G. G. Lorentz [3] but discussed
by him only for the case $p = 1$.


2. Terminology. Throughout the papers in this series we suppose
$1 \le p \le \infty$, $1 \le q \le \infty$, with $p^{-1} + q^{-1} = 1$, interpreting $\infty^{-1}$ as 0. We let
$S$ denote a space of points $P$ with a non-negative, countably additive set function
$\nu$ defined for a non-empty family of $\nu$-sets which includes relative complements
and countable unions of its members and is such that $\nu(S_1) = 0$ implies that
every subset of $S_1$ is a $\nu$-set.

Received October 7, 1952; in revised form April 2, 1953.

If $ES_1$ is a $\nu$-set for every $\nu$-set $S_1$ with $\nu(S_1)$ finite, we define $\gamma(E)$ to be the
supremum of such $\nu(ES_1)$; measure, measurability, integral and ess. sup will
refer to $\gamma$. The change from $\nu$ to $\gamma$ enables us to disregard sets $S_1$ which are
purely infinite (i.e. $\nu(S_1) = \infty$, but $0 < \nu(S_2) < \infty$ is false for all $S_2 \subset S_1$)
without actually deleting them from $S$. Both for abstract $S$ and for Euclidean
space, where $m$ in place of $\gamma$ denotes Lebesgue measure, the letter $E$ will be used
for arbitrary measurable sets; $e$ will always denote a set of finite measure.
$\int$ and ess. sup refer to the entire space when no subset is indicated. We sometimes
write $\gamma$ for $\gamma(S)$.

$\{\varphi(P)\} = \{P;\ \varphi(P)\}$ means the set of $P$ for which $\varphi(P)$ holds.

$B$ will denote a real or complex Banach space, $B^*$ its conjugate ($B$ may consist
of the real or complex numbers). If $c \in B$, $v \in B^*$ then $cv = vc$ is the value
of $v$ at $c$; $c(P)$ is the function with value $c$ for all $P$. For point or set functions
$f(P)$, $F(e)$, we define $f_E$, $F_E$ to coincide with $f$, $F$ respectively on $E$ and to
vanish outside $E$; $f_N(P)$ shall equal $f(P)$ if $|f(P)| \le N$ and $N|f(P)|^{-1}f(P)$
otherwise; $c_{i,i}$ is an abbreviation for $c_e$ with $c = c_i$, $e = e_i$; $f$ is finitely (countably)
valued if $f = \sum_i c_{i,i}$ with finite (countable) disjoint $e_i$; $f$ is Bochner measurable
if for every $e$ and every $n$ there is a finitely valued $f_1$ such that the subset of $e$
for which $|f(P) - f_1(P)| > 1/n$ has measure less than $1/n$.
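
The truncation $f_N$ defined above can be illustrated pointwise. The following is a minimal sketch, assuming numerical (real or complex) values; the function name `truncate` is hypothetical and not part of the paper.

```python
def truncate(value: complex, bound: float) -> complex:
    """Return f_N(P) for a single function value f(P) = value and N = bound."""
    modulus = abs(value)
    if modulus <= bound:
        # |f(P)| <= N: the value is left unchanged.
        return value
    # Otherwise rescale to modulus N while keeping the direction of the value.
    return bound * value / modulus
```

Applying `truncate` to every value of a finitely valued function gives the values of $f_N$ on the same sets $e_i$.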

Two functions will be identified without comment if they differ only on a 
set of measure zero; all numerical valued functions considered in these papers will 
be measurable. 

Non-negative functions $f_1(P)$, $f_2(P)$ will be called equimeasurable if
$\gamma\{f_1(P) > k\} = \gamma\{f_2(P) > k\}$ for all $k > 0$. This need not imply the stronger
relation with $\ge$ inside the braces.

We allow $\infty$ as a value for a non-negative function with the usual conventions
$0 \cdot \infty = 0$, $k/0 = \infty$ if $k > 0$, $k/\infty = 0$ if $0 \le k < \infty$; we adopt the convention
that $k \le 0/0$ is valid for all $k \ge 0$.

In §§3 and 4, $u$, $v$, and a fixed $w$, called the weight function, are non-negative
functions on $(a, b)$, $-\infty \le a < b \le \infty$, and we define $v^\circ(x)$, the level function of
$v$ with respect to the fixed $w$, in Definition 3.2.

$u(a_1, b_1)$ will denote the integral of $u$ on $(a_1, b_1)$; $(a_1, b_1)$ is called $u$-null if
$u(a_1, b_1) = 0$ and $u$-null maximal if, in addition, it is not contained in any
other $u$-null interval. We shall suppose that $0 \le w(a, x) < \infty$ for all $a < x < b$;
$\bar a$ means $a$ if $w(a, x) > 0$ for all $a < x < b$, otherwise $\bar a = \sup_x \{x;\ w(a, x) = 0\}$.
We shall suppose that $0 < w(a, b) \le \infty$. For given $v$ we let $R(a_1, b_1)$ denote
$v(a_1, b_1)/w(a_1, b_1)$ if $w(a_1, b_1) < \infty$ and $\limsup v(a_1, t)/w(a_1, t)$ as $t \to b_1$, if
$w(a_1, b_1) = \infty$; $R^\circ$ refers to $v^\circ$ in place of $v$.


$\|u\| = \|u\|_p$ shall mean:

\[
\left( \int_a^b u(x)^p\, w(x)\, dx \right)^{1/p}
\]

if $1 \le p < \infty$ and $w$-sup $u(x)$ if $p = \infty$. If $v(a, \bar a) > 0$ then $[v] = [v]_q$ shall
mean $\infty$; if $v(a, \bar a) = 0$ then $[v] = [v]_q$ shall mean

\[
\left( \int_a^b \bigl(v^\circ(x)/w(x)\bigr)^q\, w(x)\, dx \right)^{1/q}
\]

if $1 \le q < \infty$ and $w$-sup $(v^\circ(x)/w(x))$ if $q = \infty$.

$v$ will be said to have the A-property if $w(a_1, b_1) > 0$, $w(a_2, b_2) > 0$, $a_1 \le a_2$,
$b_1 \le b_2 \le b$ always imply $\infty > R(a_1, b_1) \ge R(a_2, b_2)$; $v$ will be called
non-increasing relative to $w$ if $v(x) = D(x)w(x)$ for $\bar a < x < b$ with $D(x)$ finite,
non-negative, non-increasing. $v$ will be called $w$-infinite if $v(x) = \infty$ whenever
$w(x) > 0$. $v < v_1$ will mean: $v(a, x) \le v_1(a, x)$ for all $a < x < b$; $v \ll v_1$ will
mean the stronger relation: $v(x) = v_1(x)$ for $a < x < \bar a$ and $v(\bar a, x) \le v_1(\bar a, x)$
for all $\bar a < x < b$.

In §§5, 6 we consider arbitrary $f(P)$, $g(P)$ and a fixed non-negative $w(P)$ on $S$
(there should be no confusion between $w(P)$ and the $w(x)$ of §§3, 4). The
left-continuous non-increasing rearrangement of $|f(P)|$ is defined to be a function
$f^*(x)$ on $0 \le x < \gamma$ as follows: $f^*(0) = \text{ess. sup}\, |f(P)|$ and for $x > 0$, $f^*(x) = \sup k$
with $\gamma\{|f(P)| > k\} \ge x$. In these sections, $\|f\| = \|f\|_p$ shall mean:

\[
\sup_\alpha \left( \int |f(P)|^p\, w_\alpha(P)\, d\gamma(P) \right)^{1/p}
\]

if $1 \le p < \infty$, and $\sup_\alpha (w_\alpha\text{-sup}\, |f(P)|)$ if $p = \infty$, where $w_\alpha$ varies over all
functions equimeasurable with $w$.
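
For a finitely valued function the rearrangement $f^*$ can be computed directly from the definition. The sketch below assumes the function is given as a list of hypothetical `(value, measure)` pairs, one per disjoint set of positive finite measure; the name `rearrangement` is an illustration only.

```python
def rearrangement(blocks):
    """Return f* as a callable for f given by (value, measure) pairs on disjoint sets."""
    # Largest absolute values first; sets of measure zero are ignored.
    ordered = sorted(((abs(v), m) for v, m in blocks if m > 0), reverse=True)

    def f_star(x):
        if not ordered:
            return 0
        if x == 0:
            return ordered[0][0]      # f*(0) = ess. sup |f|
        covered = 0
        for size, measure in ordered:
            covered += measure
            if covered >= x:          # measure{|f| > k} >= x for every k < size
                return size
        return 0                      # x exceeds the measure of the listed sets

    return f_star
```

The test `covered >= x` (rather than `>`) is what makes the result left-continuous, in line with the condition $\gamma\{|f(P)| > k\} \ge x$.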

Omitting some trivial cases we shall suppose that for every $k < \infty$, the
supremum of the integral of $w$ over sets of measure $\le k$ is finite and that the
integral of $w$ over $S$ is greater than zero. We shall distinguish the three
possibilities: Case ($C_1$) with $\gamma < \infty$; Case ($C_2$) with $\gamma = \infty$ and integral of $w$ over $S$
infinite; Case ($C_3$) with $\gamma = \infty$ and integral of $w$ over $S$ finite. We shall suppose
that $w$ is restricted by the condition: $\|f\|_p$ defined above agrees with $\|f^*\|_p$ as
defined for §§3, 4 with $w^*(x)$ as the weight function on $(0, \gamma)$ in place of $w(x)$
on $(a, b)$. It is easy to verify that this condition is satisfied if $w(P)$ is constant
on $S$, more generally if $w^*(x)$ is constant on $(0, \gamma)$; for any other $w$ this condition
is equivalent to the requirement that either $S$ has no atomic sets $e$ (i.e. $\gamma(e) > 0$
and $e_1 \subset e$ implies $\gamma(e_1) = 0$ or $\gamma(e - e_1) = 0$) or every measurable subset of $S$
of finite measure is a union of atomic sets of equal measure.
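
In the purely atomic situation just mentioned the condition on $w$ can be checked by brute force. The sketch below assumes a finite set with counting measure, so that the functions equimeasurable with $w$ are exactly the permutations of $w$; the function names are hypothetical, and the $p$-th power of the norm is returned to keep integer arithmetic exact.

```python
from itertools import permutations


def norm_power_bruteforce(f, w, p):
    """sup over all rearrangements w_alpha of w of sum |f_i|^p * w_alpha(i)."""
    return max(
        sum(abs(x) ** p * y for x, y in zip(f, perm))
        for perm in permutations(w)
    )


def norm_power_sorted(f, w, p):
    """sum of f*(i)^p * w*(i), pairing both sequences in non-increasing order."""
    f_star = sorted((abs(x) for x in f), reverse=True)
    w_star = sorted(w, reverse=True)
    return sum(x ** p * y for x, y in zip(f_star, w_star))
```

The two functions agree for every input, which is the sequence form of the requirement that $\|f\|_p$ agree with $\|f^*\|_p$ computed with weight $w^*$. The brute-force version is only practical for a handful of atoms.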

With this $w(P)$ we define $[g] = [g]_q$ to agree with $[g^*]_q$ as defined for §§3, 4
where the weight function to be used shall be $w^*$ on $(0, \gamma)$.

$L = L_{(w)}^p$ and $M = M_{(w)}^q$ will denote the spaces whose elements are the
numerical valued $f$, $g$ with finite norm $\|f\|_p$, $[g]_q$ respectively. $L_{(w)}^p(B)$, $M_{(w)}^q(B)$
shall denote the corresponding spaces when $f$, $g$ are valued in $B$ and are Bochner
measurable. $L_{(w)}^p(B)$, as well as the more general $L_W^p(B)$, are obviously linear,
normed spaces and for $M_{(w)}^q(B)$ this will be shown in §5; the remainder of the
proof that all $L_W^p(B)$ and all $M_{(w)}^q(B)$ are Banach spaces (i.e. the proof of
completeness) will be omitted in this paper since a more general result will be
given in [2, Theorem 3.1].

$S$ is said to have property (R) if there is a family, not necessarily countable,
of disjoint $e_\alpha$ such that an arbitrary $S_1$ is measurable and $\gamma(S_1) = 0$ whenever
$S_1 e_\alpha$ is measurable with measure 0 for every $\alpha$.


3. Level intervals and level functions. The constructions and results of
this section are required to solve the Hölder inequality problem of the next
section. We refer to §2 for terminology.


DEFINITION 3.1. $(a_1, b_1)$, with $a \le a_1 < b_1 \le b$, is called a level interval
(of $v$ with respect to $w$), abbreviation l.i., if for all $a_1 < x < b_1$, $w(a_1, x) > 0$
and $R(a_1, x) \le R(a_1, b_1)$. If the l.i. is not contained in a larger l.i. it is called a
maximal level interval, abbreviation m.l.i.

We note that if $w(a_1, x) > 0$ for all $a_1 < x < b_1$ and $R(a_1, b_1) = \infty$ then
$(a_1, b_1)$ is a l.i., $R(\bar a, b) = \infty$ and $(\bar a, b)$ is a m.l.i.


THEOREM 3.1.

(i) Every l.i. is contained in a m.l.i.

(ii) If $(a_1, b_1)$, $(a_2, b_2)$ are l.i.'s with $a_1 < a_2 < b_1 < b_2$ then $(a_1, b_2)$ is a l.i.

(iii) The m.l.i.'s are non-overlapping and denumerable.

Proof of (i). Suppose $a_1 \ge a_2 \ge \ldots$, $b_1 \le b_2 \le \ldots$, $a_0 = \inf a_n$, $b_0 = \sup b_n$.
If each $(a_n, b_n)$ is a l.i. it is easily verified that $(a_0, b_0)$ is a l.i. Now for arbitrary
l.i. $(a_1, b_1)$ the $a_n$, $b_n$ can clearly be chosen so that $(a_0, b_0)$ will be a m.l.i.

Proof of (ii). $w(a_1, x) > 0$ for $a_1 < x < b_2$ since $(a_1, b_1)$ is a l.i. We may
suppose $R(a_1, b_2) < \infty$. Then $R(a_1, a_2) \le R(a_1, b_1) \le R(a_2, b_1) \le R(a_2, b_2) \le
R(b_1, b_2)$, implying $R(a_1, b_1) \le R(a_1, b_2) \le R(a_2, b_2)$. It follows that $(a_1, b_2)$ is
a l.i., for if $a_1 < x \le b_1$ then $R(a_1, x) \le R(a_1, b_1) \le R(a_1, b_2)$; and if $b_1 < x < b_2$
then $R(a_1, b_2) \le R(a_2, b_2) \le R(x, b_2)$ which implies $R(a_1, x) \le R(a_1, b_2)$.

Proof of (iii). This follows at once from (ii).


Remark 1. A $w$-null, $v$-null interval, like a single point, may or may not be
part of a l.i. But Definition 3.1 implies that it cannot be at the beginning of a
l.i., and either all or none of it is part of a m.l.i.

Remark 2. If $(a_1, b_1)$ is $w$-null but not $v$-null and $b_1 > \bar a$ then $(a_1, b_1)$ is part
of a l.i. Indeed $\bar a < a_1$ and we may clearly suppose that $R(\bar a, b)$ is finite and
$(a_1, b_1)$ is $w$-null maximal so that $w(x, b_1) > 0$ for all $\bar a \le x < a_1$. Then $R(x, b_1)$
is finite and continuous for $\bar a \le x < a_1$ and diverges to $\infty$ as $x \to a_1$; hence it
assumes its minimum value and we let $a_2$ denote the maximum $x$ for which
this minimum is attained. Then $\bar a \le a_2 < a_1$ and for $a_2 < x < a_1$, $R(a_2, b_1) <
R(x, b_1)$. This implies: $w(a_2, x) > 0$ for all $a_2 < x < b_1$ and $(a_2, b_1)$ is a l.i.
containing $(a_1, b_1)$. (If $a_2 = \bar a = -\infty$, the argument is still valid.)

Remark 3. If $a < \bar a$ then no part of $(a, \bar a)$ can be part of a l.i.


DEFINITION 3.2. $v^\circ(x)$, the level function of $v$ with respect to $w$, is defined by:
(i) $v^\circ(x) = R(a_1, b_1)w(x)$ if $x$ is interior to a m.l.i. $(a_1, b_1)$,
(ii) $v^\circ(x) = v(x)$ for all other $x$.

If $v^\circ = v$ then $v$ is called a level function.
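
For finite sequences with strictly positive weights the level function can be computed by pooling adjacent terms. The sketch below is not the construction used in the paper: it assumes the sequence setting (sums in place of integrals) and uses the pool-adjacent-violators technique, relying on the standard fact that pooling yields the slopes of the least concave majorant of the cumulative sums, so that the ratios $v^\circ_i / w_i$ come out non-increasing. The name `level_function` is hypothetical.

```python
from fractions import Fraction


def level_function(v, w):
    """Discrete level function of v with respect to strictly positive weights w."""
    blocks = []  # each block holds [sum of v, sum of w, number of terms]
    for vi, wi in zip(v, w):
        blocks.append([Fraction(vi), Fraction(wi), 1])
        # Pool while the ratio R rises from the previous block to the last one.
        while len(blocks) > 1 and (
            blocks[-2][0] * blocks[-1][1] < blocks[-1][0] * blocks[-2][1]
        ):
            sv, sw, n = blocks.pop()
            blocks[-1][0] += sv
            blocks[-1][1] += sw
            blocks[-1][2] += n
    result, i = [], 0
    for sv, sw, n in blocks:
        ratio = sv / sw  # plays the role of R(a_1, b_1) on the pooled block
        result.extend(ratio * Fraction(wj) for wj in w[i:i + n])
        i += n
    return result
```

For example, `level_function([3, 1, 2], [1, 1, 1])` pools the last two terms and leaves the first unchanged. Pooled blocks correspond to level intervals; neighbouring blocks of equal ratio are not merged here, which does not affect the values returned.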


Remark 1. If $(a_1, b_1)$ is $w$-null and $b_1 > \bar a$ then Remarks 1 and 2 following
Theorem 3.1 imply that $v^\circ(x) = 0$ for $a_1 < x < b_1$.

Remark 2. If $a \ne \bar a$ then $v^\circ(x)$ could be obtained by the equivalent definition:
on $(a, \bar a)$ define $v^\circ(x)$ to be $v(x)$ and on $(\bar a, b)$ use Definition 3.2 but with $w$, $v$
considered as functions on $(\bar a, b)$ in place of $(a, b)$.

Remark 3. If $(b_1, b)$ is $v$-null then $v^\circ(x) = 0$ for $b_1 < x < b$. If $(b_1, b)$ is a
l.i. of $v$ and $R(b_1, b) = 0$ then $(b_1, b)$ is $v$-null.

Remark 4. If $v_1(x) = v(x)$ except on a l.i. $(a_1, b_1)$ of $v$ and $v_1(x) = R(a_1, b_1)w(x)$
for $a_1 < x < b_1$ then $v_1^\circ = v^\circ$.

Remark 5. If $R(\bar a, b) = \infty$ then $v^\circ$ is $w$-infinite.


THEOREM 3.2.

(i) $v(a_1, x) \le v^\circ(a_1, x)$ if $a_1$ is not interior to a l.i. of $v$ and equality holds if
neither $a_1$ nor $x$ is interior to a l.i.

(ii) $v < v^\circ$ and $v^\circ(a, b) = v(a, b)$.

(iii) $R(\bar a, b) = R^\circ(\bar a, b)$.

Proof of (i) and (ii). If $(a_1, b_1)$ is a m.l.i. and $a_1 < x \le b_1$ then Definition
3.2 implies that $v(a_1, x) \le v^\circ(a_1, x)$ with equality if $x = b_1$. Since $v^\circ(x) = v(x)$
outside the m.l.i.'s, this gives (i) and (ii).

Proof of (iii). We may suppose $w(a, b) = \infty$ and $R(\bar a, b) < \infty$. If now there
is a m.l.i. $(a_1, b)$ then (iii) holds; if there is no such m.l.i. then (iii) follows from
the relations: $R(\bar a, x) = R^\circ(\bar a, x)$ when $x < b$ and $x$ is not interior to a m.l.i.,
and $R(\bar a, x) \le R^\circ(\bar a, x) \le \max (R(\bar a, a_1), R(\bar a, b_1))$ when $x$ is interior to a m.l.i.
$(a_1, b_1)$ with $a_1 > \bar a$.

THEOREM 3.3.

(i) Every l.i. of $v$ is a l.i. of $v^\circ$.
(ii) Every m.l.i. of $v^\circ$ is a l.i. of $v$.
(iii) $v$ and $v^\circ$ have the same m.l.i.'s.
(iv) On each of its l.i.'s $v^\circ(x) = kw(x)$ with $k$ constant on the l.i.
(v) $v^{\circ\circ} = v^\circ$.

Proof of (i). On each l.i. $(a_1, b_1)$ of $v$, $v^\circ(x) = kw(x)$ so that $R^\circ(a_1, x)$ is
constant for $a_1 < x \le b_1$. This implies (i).

Proof of (ii), (iii), (iv), and (v). If $(a_1, b_1)$ is a m.l.i. of $v^\circ$ then by (i), neither
$a_1$ nor $b_1$ is interior to a l.i. of $v$ and hence for $a_1 < x < b_1$,

\[
R(a_1, x) \le R^\circ(a_1, x) \le R^\circ(a_1, b_1) = R(a_1, b_1).
\]

This proves (ii). Now (iii), (iv), and (v) follow from (i) and (ii).
THEOREM 3.4. If $a_1 < b_1 < b_2$ and $0 < w(b_1, b_2) \le \infty$ then
$R^\circ(b_1, b_2) \le R^\circ(a_1, b_2)$.

Proof. We may suppose $R^\circ(a_1, b_2) < \infty$. Then $R^\circ(x, b_2)$ is finite and
continuous for $a_1 \le x \le b_1$. Suppose, contrary to the statement of the theorem,
that $R^\circ(a_1, b_2) < R^\circ(b_1, b_2)$ and let $x_0$ be the maximum $x$ at which $R^\circ(x, b_2)$
assumes its minimum value. Then $x_0 < b_1$, $R^\circ(x_0, b_2) < R^\circ(x, b_2)$ for $x_0 < x \le b_1$
and $w(x_0, x) > 0$ for $x_0 < x \le b_1$. Now let $x_1$ be an $x$ at which $R^\circ(x_0, x)$ on
$b_1 \le x \le b_2$ assumes its maximum value. Necessarily $b_1 < x_1$. If $x_0 < x \le b_1$
then $R^\circ(x_0, x) < R^\circ(x_0, b_2) \le R^\circ(x_0, x_1)$; if $b_1 \le x \le x_1$, $R^\circ(x_0, x) \le R^\circ(x_0, x_1)$.
This implies that $(x_0, x_1)$ is a l.i. of $v^\circ$, hence that $R^\circ(x_0, x)$ is constant for
$x_0 < x \le x_1$, contradicting the previous inequality $R^\circ(x_0, b_1) < R^\circ(x_0, x_1)$.

THEOREM 3.5. If $a_1 < b_1 < b_2$, $0 < w(a_1, b_1)$ and $w(a_1, b_2) < \infty$ then
$R^\circ(a_1, b_2) \le R^\circ(a_1, b_1)$.

Proof. If $w(b_1, b_2) = 0$, Remark 1 following Definition 3.2 implies
$v^\circ(x) = 0$ for $b_1 < x < b_2$ and hence the theorem. If $w(b_1, b_2) > 0$, the theorem
is a corollary of Theorem 3.4.


THEOREM 3.6. For a function $v$ the following are equivalent: to be a level function
but not $w$-infinite; to be non-increasing relative to $w$; to have the A-property.

Proof. In view of Theorems 3.4 and 3.5 we need only prove that the A-property
implies that $v$ is non-increasing relative to $w$. Let $E$ denote the set
union of the denumerable family of closed $w$-null maximal intervals and $E'$
the set of $x$ with $\bar a < x < b$ and $x$ not in $E$. For $x$ in $E'$ and $t > 0$ set $H(x, t) =
R(x, t_1)$ with $t_1 = \min (x + t, \tfrac12 (x + b))$. For fixed $x$, $H(x, t)$ is non-decreasing
as $t$ decreases to zero and for fixed $t$, $H(x, t)$ is non-increasing in $x$. Hence
$D(x) = \lim H(x, t)$ as $t \to 0$ exists for all $x$ in $E'$ and is finite, non-negative and
non-increasing for these $x$. Since

\[
\frac{v(x, t_1)}{t_1 - x} = H(x, t)\, \frac{w(x, t_1)}{t_1 - x},
\]

the fundamental theorem of the (Lebesgue) calculus shows that $v(x) =
D(x)w(x)$ for almost all $x$ in $E'$. But for almost all $x$ in $E$ with $x > \bar a$, we have
$v(x) = w(x) = 0$ since the A-property implies $v(a_1, b_1) = 0$ whenever $(a_1, b_1)$
is $w$-null with $b_1 > \bar a$. It follows that $D(x)$ can be so defined for the $x$ which are
$> \bar a$ and in the closed intervals which constitute $E$ that $v(x) = D(x)w(x)$ will
hold for all $x > \bar a$.

THEOREM 3.7.

(i) $v < v_1$ implies $v^\circ < v_1^\circ$ and $v \ll v_1$ implies $v^\circ \ll v_1^\circ$.

(ii) If $v_1$ is a level function, $v < v_1$ implies $v^\circ < v_1$ and $v \ll v_1$ implies $v^\circ \ll v_1$.

(iii) $v^\circ$ can be characterized among the level functions (equivalently, the functions
which are $w$-infinite or non-increasing relative to $w$) $v_1$ with $v \ll v_1$ as the one for
which $v_1(\bar a, x)$ attains the minimum value for every $\bar a < x < b$.

Proof. We shall show that $v < v_1$ implies $v^\circ(a, x) \le v_1^\circ(a, x)$ for $a < x < b$.
We may clearly suppose $v(a, x) < \infty$ for all $a < x < b$ and $R(a, b) < \infty$.
If $x$ is not interior to a m.l.i. of $v$ then

\[
v^\circ(a, x) = v(a, x) \le v_1(a, x) \le v_1^\circ(a, x).
\]

If $x$ is interior to a m.l.i. $(a_1, b_1)$ of $v$ with $w(a_1, b_1) < \infty$, then

\[
\begin{aligned}
v^\circ(a, x) &= v(a, a_1) + \frac{w(a_1, x)}{w(a_1, b_1)}\, v(a_1, b_1) \\
&= v(a, a_1)\left(1 - \frac{w(a_1, x)}{w(a_1, b_1)}\right) + \frac{w(a_1, x)}{w(a_1, b_1)}\, v(a, b_1) \\
&\le v_1^\circ(a, a_1)\left(1 - \frac{w(a_1, x)}{w(a_1, b_1)}\right) + \frac{w(a_1, x)}{w(a_1, b_1)}\, v_1^\circ(a, b_1) \\
&= v_1^\circ(a, a_1) + \frac{w(a_1, x)}{w(a_1, b_1)}\, v_1^\circ(a_1, b_1) \le v_1^\circ(a, x),
\end{aligned}
\]

since $v_1^\circ$ is either $w$-infinite or non-increasing relative to $w$.
Finally, if $x$ is interior to a m.l.i. $(a_1, b_1)$ of $v$ with $w(a_1, b_1) = \infty$, then $b_1 = b$
and

\[
v^\circ(a, x) = v(a, a_1) + R(a_1, b)\, w(a_1, x)
\le v_1^\circ(a, a_1) + \frac{v_1^\circ(a_1, x)}{w(a_1, x)}\, w(a_1, x) = v_1^\circ(a, x),
\]

since $v_1^\circ(a_1, t)/w(a_1, t)$ is non-increasing and has limit $\ge R(a, b)$ when $t \to b$.


4. D-type Hölder inequalities. We refer to §2 for terminology. $u$, $v$, $w$ are
non-negative throughout this section.

THEOREM 4.1. If $u(x)$ is non-increasing on $(a, b)$ then $v < v_1$ implies that for
all $a < x < b$,

\[
\int_a^x v(t)u(t)\, dt \le \int_a^x v_1(t)u(t)\, dt.
\]

Proof. It is sufficient to prove the theorem for $u$ of the form: $u(x) = k_i$ on
$(a_i, a_{i+1})$ with $a = a_1 < a_2 < \ldots < a_{n+1} = b$ and $k_1 \ge k_2 \ge \ldots \ge k_n \ge 0$;
using Abel's rearrangement, it is sufficient to prove the theorem for $u(x) = k$
on $(a, a_1)$ and 0 elsewhere, for arbitrary $k > 0$; and this follows directly from
$v < v_1$.
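
Abel's rearrangement, as used in this proof, can be written out for finite sequences. The following is a sketch under the assumption that sums replace integrals; the function names are hypothetical. It rewrites $\sum u_i v_i$ as $\sum (u_i - u_{i+1}) V_i$ with $V_i = v_1 + \ldots + v_i$ and $u_{n+1} = 0$, so that each term has a non-negative coefficient when $u$ is non-increasing.

```python
from itertools import accumulate


def weighted_sum(u, v):
    """Direct evaluation of sum u_i * v_i."""
    return sum(ui * vi for ui, vi in zip(u, v))


def abel_sum(u, v):
    """The same sum via Abel's rearrangement: sum (u_i - u_{i+1}) * V_i."""
    partial = list(accumulate(v))                      # V_i = v_1 + ... + v_i
    steps = [a - b for a, b in zip(u, list(u[1:]) + [0])]  # u_{n+1} taken as 0
    return sum(s * big_v for s, big_v in zip(steps, partial))
```

Since every step $u_i - u_{i+1}$ is non-negative for non-increasing $u$, replacing $v$ by a sequence with larger partial sums can only increase the result, which is the sequence form of Theorem 4.1.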


THEOREM 4.2. If $u$ varies over all non-increasing functions with $\|u\|_p \le 1$
then

\[
\sup \int_a^b u(x)v(x)\, dx = [v]_q, \tag{4.1}
\]

and for arbitrary $v$ and arbitrary non-increasing $u$, the D-type (i.e., decreasing type)
Hölder inequality holds:

\[
\int_a^b u(x)v(x)\, dx \le \|u\|_p\, [v]_q.
\]


Proof. Both sides of (4.1) are infinite if any of the following hold: (i) $v(a, \bar a)
> 0$, (ii) $R(\bar a, b) = \infty$, or (iii) $w(a, b) = \infty$, $p < \infty$, and $R(\bar a, b) > 0$. To verify
that the left-hand side of (4.1) is infinite: if (i) holds, choose $u(x) = k$ on
$(a, \bar a)$, $= 0$ elsewhere and let $k \to \infty$; if (ii) or (iii) holds, let $t$ be fixed and choose
$u(x) = w(a, t)^{-1/p}$ on $(a, t)$, $= 0$ elsewhere and let $t \to b$. We may therefore
suppose none of (i), (ii), or (iii) holds; then $v^\circ(x) = 0$ whenever $w(x) = 0$
and $v^\circ(x) = D(x)w(x)$ for all $x$ for some $D(x)$, non-negative, non-increasing,
and finite for $x > \bar a$.

Now by Theorem 4.1 and the ordinary Hölder inequality, for $\|u\|_p \le 1$,

\[
\int_a^b u(x)v(x)\, dx \le \int_a^b u(x)v^\circ(x)\, dx
= \int_a^b u(x)w(x)^{1/p}\, v^\circ(x)w(x)^{-1/p}\, dx \le [v]_q.
\]

Thus $\le$ holds in (4.1).

If $p = \infty$, then $\ge$ holds in (4.1) as can be verified by choosing $u(x)$ to be
identically 1.

If $p < \infty$, consider any $t$ not interior to a m.l.i. of $v$ for which $w(a, t) < \infty$
and set $u_1(x) = (D_N(x))^{q-1}$ on $(a, t)$, $= 0$ elsewhere. Let $u(x) = \|u_1\|_p^{-1} u_1(x)$.
Then $\|u\|_p = 1$, $u(x)$ is non-negative, non-increasing, and constant on each
m.l.i. of $v$. With this $u$,

\[
\int_a^b u(x)v(x)\, dx \ge \left( \int_a^t (D_N(x))^q\, w(x)\, dx \right)^{1/q}.
\]

Since $N$ is arbitrary, this proves that the left-hand side of (4.1) is greater than
or equal to

\[
\left( \int_a^t D(x)^q\, w(x)\, dx \right)^{1/q}
= \left( \int_a^t \bigl(v^\circ(x)/w(x)\bigr)^q\, w(x)\, dx \right)^{1/q}
\]

for every such $t$. This obviously implies $\ge$ in (4.1) except when $w(a, b) = \infty$
and $v$ has a m.l.i. $(b_1, b)$; but for this case, $R(\bar a, b) = 0$ (since $p < \infty$), hence
$v^\circ(x) = 0$ on $(b_1, b)$, and $\ge$ in (4.1) is obtained when $t = b_1$.

This completes the proof of the theorem.
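
The equality (4.1) can be checked numerically for finite sequences. The sketch below assumes the sequence setting with strictly positive weights, $1 < p < \infty$, and a $v$ that is not identically zero; the level ratios $D_i = v^\circ_i / w_i$ are obtained by pooling adjacent terms, which is a substituted technique rather than the paper's construction, and all function names are hypothetical. The extremal $u$ is taken proportional to $D^{q-1}$, as in the proof.

```python
def level_ratios(v, w):
    """Ratios D_i = v°_i / w_i of the discrete level function (positive weights)."""
    blocks = []  # each block holds [sum of v, sum of w, number of terms]
    for vi, wi in zip(v, w):
        blocks.append([vi, wi, 1])
        # Pool while the ratio rises from the previous block to the last one.
        while len(blocks) > 1 and (
            blocks[-2][0] * blocks[-1][1] < blocks[-1][0] * blocks[-2][1]
        ):
            sv, sw, n = blocks.pop()
            blocks[-1][0] += sv
            blocks[-1][1] += sw
            blocks[-1][2] += n
    return [sv / sw for sv, sw, n in blocks for _ in range(n)]


def bracket_norm(v, w, q):
    """[v]_q = (sum D_i^q * w_i)^(1/q) for 1 < q < infinity."""
    return sum(d ** q * wi for d, wi in zip(level_ratios(v, w), w)) ** (1 / q)


def weighted_norm(u, w, p):
    """||u||_p = (sum u_i^p * w_i)^(1/p)."""
    return sum(ui ** p * wi for ui, wi in zip(u, w)) ** (1 / p)


def extremal_u(v, w, p):
    """Non-increasing u proportional to D^(q-1), scaled so that ||u||_p = 1."""
    q = p / (p - 1)
    raw = [d ** (q - 1) for d in level_ratios(v, w)]
    scale = weighted_norm(raw, w, p)  # non-zero because v is not identically zero
    return [r / scale for r in raw]
```

With `u = extremal_u(v, w, p)`, the sum of `u[i] * v[i]` should agree with `bracket_norm(v, w, q)` up to rounding, while any other non-increasing `u` of norm at most 1 should give a value no larger.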


COROLLARY. $v < v_1$, in particular $v(x) \le v_1(x)$ for all $x$, implies $[v]_q \le [v_1]_q$;
$v_n(x) \le v(x)$ for all $x$ together with $v_n(x) \to v(x)$ as $n \to \infty$ for each $x$, implies
$[v_n]_q \to [v]_q$; $[v_N]_q \le [v]_q$ and $[v_N]_q \to [v]_q$ as $N \to \infty$; $[v_1 + v_2]_q \le [v_1]_q + [v_2]_q$.


Remark 1. Theorem 4.2 implies the more general theorem: with fixed
non-negative $u_0(x)$ let $u(x)$ be restricted to functions of the form $u_1(x)u_0(x)$ with
$u_1(x)$ non-negative, non-increasing on $(a, b)$ and $\|u_1\|_p \le 1$; then

\[
\sup \int_a^b u(x)v(x)\, dx = \sup \int_a^b u_1(x)u_0(x)v(x)\, dx = [vu_0]_q.
\]

It follows that if $u$ is fixed, not necessarily non-increasing, and $v$ varies over all
level functions on $(a, b)$ with $[v]_q \le 1$, then

\[
\sup \int_a^b u(x)v(x)\, dx = [uw]_p \tag{4.2}
\]

and clearly the right-hand side of (4.2) is equal to $\|u\|_p$ if $u$ is non-increasing.


Remark 2. Standard arguments now show that the supremum in (4.1) is
actually attained except when $p = 1$, $[v] < \infty$, $D(\bar a + 0) > R(\bar a, x)$ for all
$\bar a < x < b$ all hold. Consequently, if $u(x)v(x)$ has a finite integral whenever
$u(x)$ is non-increasing with $\|u\|_p \le 1$ then $[v]_q < \infty$.

Remark 3. The supremum in (4.1) will not be changed if on each of an
arbitrary family of non-overlapping level intervals $(a_i, b_i)$ of $v$, $v(x)$ is replaced
by $R(a_i, b_i)w(x)$. The proof of Theorem 4.2 also shows that the supremum in
(4.1) will not be changed if $u(x)$ is further restricted to be constant on each of
the level intervals $(a_i, b_i)$ for which $w(a_i, b_i) < \infty$.

It follows that if on each of a family of finite non-overlapping intervals,
$v(x)$ is constant and $w(x)$ is non-increasing, then the supremum in (4.1) will not
be changed if $u(x)$ is further restricted to be constant on each of these intervals.

Remark 4. The preceding results of sections 3, 4 apply if all functions
$u$, $v$ as well as the weight function $w$, are required to be constant on each of a
fixed but arbitrary family of non-overlapping sub-intervals of $(a, b)$. If $(a, b)$ is
entirely subdivided into such intervals of equal length, there results the
corresponding theory for finite, infinite, or doubly infinite sequences with integrals
replaced by sums. In this case the supremum in (4.1) is attained for all cases.

Remark 5. Theorem 4.2 and the Remarks above remain valid if $u(x)$ is
further restricted to have finite value for every $x$, with the following exception:
if $v(a, \bar a) > 0$ then the supremum in (4.1), namely $\infty$, may not be attained for
such $u$ in the general situation described in Remark 4. Thus, for sequences, if
$a < \bar a$, the finiteness of $\sum u_n v_n$ for all non-increasing $u_n$ with $\|u\|_p \le 1$ and $u_n < \infty$
for every $n$, need not imply that $[v]_q < \infty$.


5. The spaces $L_{(w)}^p$ and $M_{(w)}^q$. We refer to §2 for terminology.

The definition of $f^*$ implies: $f^*(x) \le f_1^*(x)$ for all $x$ if $|f(P)| \le |f_1(P)|$ for
almost all $P$; $f_e^*(x) = 0$ for all $x > \gamma(e)$; $f_N^*(x) = (f^*)_N(x)$ for all $x$; if $|f_n(P)|
\le |f(P)|$ and $|f_n(P)| \to |f(P)|$ as $n \to \infty$, for almost all $P$, then $f_n^*(x) \le f^*(x)$
and $f_n^*(x) \to f^*(x)$ as $n \to \infty$ for all $x$; if $1 \le p < \infty$, the left-continuous,
non-increasing rearrangement of the function $|f(P)|^p$ is equal to $f^*(x)^p$ for all $x$;
$(f_1 + f_2)^* < (f_1^* + f_2^*)$.

Theorem 3.7 (i), together with the Corollary to Theorem 4.2, now show that
$[g_1 + g_2] \le [g_1] + [g_2]$ and from this it follows easily that $M_{(w)}^q(B)$ is a linear
normed space. Furthermore if $|g_n(P)| \le |g(P)|$ and $|g_n(P)| \to |g(P)|$ as $n \to \infty$
for almost all $P$ then $[g_n] \le [g]$ and $[g_n] \to [g]$ as $n \to \infty$.

The hypotheses on $w$ ensure: $\|f\|_p = \|f_1\|_p$ and $[g]_q = [g_1]_q$ if the non-negative
functions $|f(P)|$, $|f_1(P)|$ and $|g(P)|$, $|g_1(P)|$ are equimeasurable, respectively.

Throughout the remainder of §§5 and 6, $f(P)$ and $g(P)$ will denote numerical
valued functions not necessarily in $L_{(w)}^p$, $M_{(w)}^q$ respectively.


THEOREM 5.1.
(i) If $f$ varies subject to the condition $\|f\|_p \le 1$ then

\[
\sup \int_S |f(P)|\, |g(P)|\, d\gamma(P) = [g]_q, \tag{5.1}
\]

and if $[g]_q < \infty$ then

\[
\sup \left| \int_S f(P)g(P)\, d\gamma(P) \right| = [g]_q.
\]

(ii) If $g$ varies subject to the condition $[g]_q \le 1$ then

\[
\sup \int_S |f(P)|\, |g(P)|\, d\gamma(P) = \|f\|_p, \tag{5.2}
\]

and if $\|f\|_p < \infty$ then

\[
\sup \left| \int_S f(P)g(P)\, d\gamma(P) \right| = \|f\|_p.
\]


Proof of (i). It is now clear that we need establish (5.1) only for finitely
valued $g(P)$ and with $f$ further restricted to be finitely valued. With such $g$
and $f$ it is easy to verify that

\[
\sup \int_S |f(P)|\, |g(P)|\, d\gamma(P) = \sup \int_0^\gamma f^*(x)\, g^*(x)\, dx. \tag{5.3}
\]

Since $w^*(x)$ is non-increasing, and $g^*$ is a step function, Remark 3 following
Theorem 4.2 implies that the right-hand side of (5.3) is equal to $[g]_q$.

Proof of (ii). We need establish (5.2) only for finitely valued $f(P)$ and with $g$
further restricted to be finitely valued. The argument of (i) together with
(4.2) proves (ii).

Remark. The proof given for Theorem 5.1 shows that if $g(P)$ is constant on
each of a countable set of disjoint $e_i$, then the supremum in (5.1) is not changed
if $f$ is further restricted to be constant on each $e_i$. Similarly, if $f(P)$ is constant
on each of a countable set of disjoint $e_i$, then the supremum in (5.2) is not
changed if $g$ is further restricted to be constant on each $e_i$.


THEOREM 5.2.

(i) If $f_1(P)$ and $f_2(P)$ are different from zero on disjoint sets, i.e. $f_1(P)f_2(P) = 0$
for almost all $P$ then: if $p = \infty$, $\|f_1 + f_2\| = \max (\|f_1\|, \|f_2\|)$; and if $1 \le p < \infty$,
$\|f_1 + f_2\|^p \le \|f_1\|^p + \|f_2\|^p$.

(ii) If $g_1(P)g_2(P) = 0$ for all $P$ then:

\[
\begin{aligned}
{[g_1 + g_2]} &= [g_1] + [g_2], & q &= 1, \\
[g_1 + g_2]^q &\ge [g_1]^q + [g_2]^q, & 1 &< q < \infty, \\
[g_1 + g_2] &\ge \max ([g_1], [g_2]), & q &= \infty.
\end{aligned}
\]
Proof of (i). When $p = \infty$, $\|f\|$ coincides with ess. sup $|f(P)|$. When $1 \le p < \infty$,

\[
\begin{aligned}
\|f_1\|^p + \|f_2\|^p &= \int_0^\gamma \bigl( (|f_1|^p)^*(x) + (|f_2|^p)^*(x) \bigr)\, w^*(x)\, dx \\
&\ge \int_0^\gamma (|f_1|^p + |f_2|^p)^*(x)\, w^*(x)\, dx \\
&= \int_0^\gamma (|f_1 + f_2|^p)^*(x)\, w^*(x)\, dx = \|f_1 + f_2\|^p.
\end{aligned}
\]

Proof of (ii). When $q = 1$, $[g]$ coincides with the integral of $|g(P)|$ on $S$.
Theorem 5.1 shows that $[g_1 + g_2] \ge \max ([g_1], [g_2])$ for all $q$, so that we need
consider only the case $1 < q < \infty$ and we may clearly suppose $[g_1]$, $[g_2]$ both
finite and positive. Then for any $\epsilon > 0$, Theorem 5.1 implies that there are
$f_1$, $f_2$ with $\|f_i\| = [g_i]^{q-1}$ and

\[
\int |f_i(P)|\, |g_i(P)|\, d\gamma(P) \ge [g_i]^q - \epsilon, \qquad i = 1, 2.
\]

It may clearly be supposed further that $f_i(P) = 0$ wherever $g_i(P) = 0$ so that
$f_1(P)f_2(P) = 0$ for all $P$. It follows that

\[
\int |(f_1 + f_2)(P)|\, |(g_1 + g_2)(P)|\, d\gamma(P) \ge [g_1]^q + [g_2]^q - 2\epsilon
\]

and

\[
\|f_1 + f_2\| \le \bigl( \|f_1\|^p + \|f_2\|^p \bigr)^{1/p} = \bigl( [g_1]^q + [g_2]^q \bigr)^{1/p}.
\]

The validity of (ii) follows at once.
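
Part (i) of Theorem 5.2 can be checked on small sequences. The sketch below assumes a finite set with counting measure and a weight sequence `w_star` that is already non-increasing, so that $\|f\|^p = \sum_i f^*(i)^p\, w^*(i)$; the function name is hypothetical and the $p$-th power is returned to keep the arithmetic exact.

```python
def norm_power(f, w_star, p):
    """||f||^p = sum f*(i)^p * w*(i) for non-increasing weights w_star."""
    f_star = sorted((abs(x) for x in f), reverse=True)  # rearrangement of |f|
    return sum(x ** p * y for x, y in zip(f_star, w_star))
```

With `f1 = [3, 0, 1, 0]`, `f2 = [0, 2, 0, 2]` and `w_star = [4, 3, 2, 1]`, the two functions have disjoint supports and the value for `f1 + f2` is strictly smaller than the sum of the two separate values; with a constant weight the two sides coincide, in line with the Remark that follows.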

Remark. It is easy to show that the inequalities can be replaced by equalities
only if $w^*(x)$ is a constant (then $L_{(w)}^p$, $M_{(w)}^q$ coincide essentially with classical
$L^p$, $L^q$) or if $p = \infty$ for $L_{(w)}^p$ (which is actually identical with classical $L^\infty$ for
all $w$), or if $q = 1$ for $M_{(w)}^q$ (which is actually identical with classical $L^1$ for
all $w$).

THEOREM 5.3.

(i) If $f_1(P)f_2(P) = 0$ for all $P$ and $1 \le p < \infty$ then $\|f_1 + f_2\| = \|f_1\| < \infty$
implies $f_2(P) = 0$ for almost all $P$, in Case ($C_1$) if $w^*(x) > 0$ for all $0 < x < \gamma$,
and in Case ($C_2$).

(ii) If $g_1(P)g_2(P) = 0$ for all $P$ and $1 < q < \infty$ then $[g_1 + g_2] = [g_1] < \infty$
implies $g_2(P) = 0$ for almost all $P$.

Proof. (ii) is an immediate corollary to Theorem 5.2 (ii).

If (i) were false there would be an $\epsilon > 0$ such that $\gamma\{|f_2(P)| > \epsilon\} > \epsilon$.
Suppose $\gamma\{|f_1(P)| > \epsilon\} = A$. Then $f_1^*(x) > \epsilon$ for all $0 < x \le A$ so that $A$
must be finite in Case ($C_1$) or Case ($C_2$). Then $(f_1 + f_2)^*(x) \ge f_1^*(x)$ for all $x$
and $(f_1 + f_2)^*(x) > \epsilon \ge f_1^*(x)$ for all $A < x < A + \epsilon$. Since $w^*(x) > 0$ for
$A < x < A + \epsilon$ and $\|f_1\| < \infty$ we would deduce $\|f_1 + f_2\| > \|f_1\|$, contrary to the
hypothesis.


THEOREM 5.4.

(i) $\|f_N\| \le \|f\|$; $\|f_N\| \to \|f\|$ as $N \to \infty$; if $\|f\| < \infty$, then $\|f - f_N\| \to 0$ as $N \to \infty$.

(ii) $\|f_e\| \le \|f\|$; $\sup \|f_e\|$ (for all $e$) $= \|f\|$; $\gamma(e) \to 0$ implies $\|f_e\| \to 0$ whenever
$\|f\| < \infty$ if and only if either $1 \le p < \infty$ or $p = \infty$ and for some $A > 0$, $\gamma(e) = 0$
whenever $\gamma(e) < A$.

Proof. Parts of (i) and (ii) follow easily from the definition of $\|f\|$. To complete
the proof of (i) we note: $(f - f_N)^*(x) \le f^*(x)$ for all $x$ and $= 0$ if $f^*(x) \le N$;
if $\|f\| < \infty$ and $p = \infty$ then $f = f_N$ for some $N$; and if $\|f\| < \infty$ and $1 \le p < \infty$
then $A = \gamma\{|f(P)| > N\} = m\{|f^*(x)| > N\} \to 0$ as $N \to \infty$ and

\[
\|f - f_N\|^p = \int_0^A (f - f_N)^*(x)^p\, w^*(x)\, dx \le \int_0^A f^*(x)^p\, w^*(x)\, dx.
\]

To complete the proof of (ii) we note: $f_e^*(x) \le f^*(x)$ for all $x$ and $= 0$ for
$x > \gamma(e)$; if $1 \le p < \infty$ then

\[
\|f_e\|^p \le \int_0^{\gamma(e)} f^*(x)^p\, w^*(x)\, dx;
\]

and if $p = \infty$,

\[
\|f_e\| = \text{ess. sup}\, |f(P)| \text{ on } e.
\]


THEOREM 5.5.

(i) $[g_N] \le [g]$; $[g_N] \to [g]$ as $N \to \infty$; $N \to \infty$ implies $[g - g_N] \to 0$ whenever
$[g] < \infty$ if and only if either $1 \le q < \infty$ or $q = \infty$ and ess. sup $w(P)$ on $S$ is
finite.

(ii) $[g_e] \le [g]$; $\sup [g_e]$ (for all $e$) $= [g]$; if $[g] < \infty$ then $\gamma(e) \to 0$ implies
$[g_e] \to 0$ if and only if either $1 \le q < \infty$ or $q = \infty$ but for some $A > 0$, $\gamma(e) = 0$
whenever $\gamma(e) < A$.


Proof. Parts of (i) and (ii) are easy consequences of Theorem 5.1(i). To
prove the rest of (i) we note: if $[g] < \infty$, and $q = \infty$ and ess. sup $w(P)$ is finite,
then $g = g_N$ for some $N$; if $q = \infty$ and ess. sup $w(P)$ is infinite then $[w] =
[w - w_N] = 1$ for all $N$; and if $[g] < \infty$ and $1 \le q < \infty$ then

\[
A(N) = \gamma\{|g(P)| > N\} \to 0
\]

as $N \to \infty$ (as is easily verified) and, using Theorem 5.1, we obtain

\[
[g - g_N] \le \sup \int_0^\gamma u(x)\, g^*(x)\, dx
\]

where $u(x)$ is restricted to be non-negative, non-increasing, to have $\|u\|_p \le 1$,
and to vanish for $x > A(N)$. Since $v < v^\circ$ for all $v$ (Theorem 3.2 (ii)), Theorem
4.1 shows that, for all such $u$,

\[
[g - g_N] \le \sup \int_0^\gamma u(x)\, g^*(x)\, dx \le \sup \int_0^\gamma u(x)\, v(x)\, dx,
\]

where $v(x) = g^{*\circ}(x)$ for $0 < x \le A(N)$, $= 0$ for all other $x$. Theorem 4.2 now
gives

\[
[g - g_N]^q \le [v]^q = \int_0^{A(N)} \left( \frac{g^{*\circ}(x)}{w^*(x)} \right)^q w^*(x)\, dx.
\]

Thus $[g - g_N] \to 0$ since $A(N) \to 0$ when $N \to \infty$.

To prove the rest of (ii) we note: if $1 \le q < \infty$,

\[
[g_e] \le \sup \int_0^\gamma u(x)\, g^*(x)\, dx \le \sup \int_0^{\gamma(e)} u(x)\, g^{*\circ}(x)\, dx
\]

for all non-negative, non-increasing $u(x)$ with $\|u\| \le 1$ and $u(x) = 0$ for $x > \gamma(e)$.
Hence

\[
[g_e]^q \le \int_0^{\gamma(e)} \left( \frac{g^{*\circ}(x)}{w^*(x)} \right)^q w^*(x)\, dx
\]


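which $\to 0$ when $\gamma(e) \to 0$; if $q = \infty$,

\[
[g_e] = \sup \frac{\int_{e_1} |g(P)|\, d\gamma(P)}{w^*(0, \gamma(e_1))}
\]

for all $e_1 \subset e$ with $\gamma(e_1) > 0$, and hence if $\gamma(e) \to 0$ implies $[w_e] \to 0$ there must
be $A > 0$ such that $\gamma(e) < A$ implies $\gamma(e) = 0$.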


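THEOREM 5.6. $|f| < \infty$ implies that for any $\epsilon > 0$, there is an $e$ for which $|f - f_e| < \epsilon$ if and only if either Case $(C_1)$ holds ($1 \le p \le \infty$) or Case $(C_2)$ holds with $1 \le p < \infty$.


Proof. In Case $(C_1)$, $S$ may be used for $e$. In Case $(C_2)$, with $1 \le p < \infty$ there is a finite $A$ with

\[ \int_A^\infty f^*(x)^p\, w^*(x)\,dx < \tfrac12\epsilon. \]

Let $e = \{|f(P)|^p > \epsilon/2w^*(0, A)\}$. Then

\[ |f - f_e|^p = \Big(\int_0^A + \int_A^\infty\Big)(f - f_e)^*(x)^p\, w^*(x)\,dx \le \frac{\epsilon}{2w^*(0, A)}\int_0^A w^*(x)\,dx + \int_A^\infty f^*(x)^p\, w^*(x)\,dx < \epsilon. \]

In Case $(C_2)$ with $p = \infty$, and in Case $(C_3)$ with arbitrary $p$, $1 \le p \le \infty$, the function $f(P)$ identically 1 has $(f - f_e)^*(x) = f^*(x) = 1$ for $0 < x < \infty$ and hence $|f - f_e| = |f| > 0$ for all $e$.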













COROLLARY. $|f| < \infty$ implies that $f = f_E$ with $E$ a countable union of sets of finite measure, for all $p$ in Case $(C_1)$ and for $1 \le p < \infty$ in Case $(C_2)$.


THEOREM 5.7. $[g] < \infty$ implies that for $\epsilon > 0$ there is an $e$ for which $[g - g_e] < \epsilon$ except for $q = \infty$ in Case $(C_2)$.


Proof. In Case $(C_1)$, $[g - g_e] = 0$ when $e = S$ for $1 \le q \le \infty$.

In Cases $(C_2)$ and $(C_3)$, if $1 \le q < \infty$, Theorem 5.1 (i) implies that for some $e$, $[g_e]^q > [g]^q - \epsilon^q$ and Theorem 5.2(ii) implies $[g - g_e] < \epsilon$.

In Case $(C_3)$ with $q = \infty$,

\[ [g] = \sup \int_0^\infty u(x)\, g^*(x)\,dx \quad \text{for} \quad \int_0^\infty u(x)\, w^*(x)\,dx \le 1 \]

with $u(x)$ also restricted to be non-negative and non-increasing. The particular choice $u(x) = (w^*(0, \infty))^{-1}$ for all $x$ (we are in Case $(C_3)$) shows that $g^*(x)$ has a finite integral on $(0, \infty)$; and if, for $k > 0$, we set $e = \{|g(P)| > k\}$ then $A(k) = \gamma(e)$ is finite.

Clearly we need consider only the case with $A(k) \to \infty$ as $k \to 0$. Then choose $B$ so large and then $k$ so small that





= oe i * ° ew* (0, B) 
ro, (x) dx < he; R<—5R Ak) > B. 


Now, when $u$ varies so that $|u|_1 \le 1$, with $u(x)$ non-negative and non-increasing,

\[ [g - g_e] \le \sup\Big(\int_0^B u(x)\,k\,dx + \int_B^\infty u(x)\, g^*(x)\,dx\Big) \le k\,[1_{(0,B)}] + (\sup u(B))\int_B^\infty g^*(x)\,dx < \tfrac12\epsilon + \tfrac12\epsilon = \epsilon \]

since $[1_{(0,B)}] = B\,w^*(0, B)^{-1}$ and $u(B)\,w^*(0, B) \le 1$ (use the fact that $u(x)$ is non-increasing).

In Case $(C_2)$ with $q = \infty$, $[w - w_e] = [w] = 1$ for all $e$.


COROLLARY. $[g] < \infty$ implies $g = g_E$ with $E$ a countable union of sets of finite measure except for $q = \infty$ in Case $(C_2)$.


THEOREM 5.8. 
(i) $|1_E|_p = w^*(0, \gamma(E))^{1/p}$.
(ii) [lg], = w*(0, y(E))'*. 


Proof. The calculations are easy. 


THEOREM 5.9. Let $e_\alpha$, $E_\beta$ be families of subsets of $S$ such that for $\epsilon > 0$ and given $e$, $E$ there are sets $e'$, $E'$, countable unions of the $e_\alpha$, $E_\beta$ respectively, with $\gamma(e - ee') + \gamma(e' - ee') < \epsilon$ and $\gamma(E - EE') + \gamma(E' - EE') < \epsilon$. Let $T_1$, $T_2$, $T_3$, $T_4$ be the sets of functions which are constant and rational on each of a finite number of the $e_\alpha$, $E_\beta$, arbitrary $e$, arbitrary $E$, respectively. Then:

(i) For $1 \le p < \infty$, $T_1$ is dense in $L_{(w)}^p$ in Cases $(C_1)$ and $(C_2)$ and $T_2$ is dense in $L_{(w)}^p$ in all Cases $(C_1)$, $(C_2)$, $(C_3)$; for $p = \infty$, $T_4$ is dense in $L_{(w)}^\infty$.

(ii) For $1 \le q < \infty$, $T_1$ is dense in $M_{(w)}^q$ in all Cases $(C_1)$, $(C_2)$, $(C_3)$; for $q = \infty$, if ess. sup $w(P)$ is finite, then $T_3$ is dense in $M_{(w)}^\infty$ in Cases $(C_1)$, $(C_3)$ and $T_4$ is dense in $M_{(w)}^\infty$ in Case $(C_2)$.


Proof. The proof of this theorem follows from standard arguments using the 
preceding theorems. 


Remark. If $S$ has a countable family of $E_\beta$ as in Theorem 5.9 with $\gamma(E_\beta)$ finite for every $\beta$, for example if $S$ is a subset of Euclidean space with positive Lebesgue measure, then $L_{(w)}^p$ is separable in Cases $(C_1)$, $(C_2)$ for $1 \le p < \infty$ and has dimensionality equal to the power of the continuum in Case $(C_3)$ for all $1 \le p \le \infty$, and in Cases $(C_1)$ and $(C_2)$ if $p = \infty$. To prove this, note that in Case $(C_3)$ if $e_1, e_2, \ldots$ are disjoint, each of measure $\ge A > 0$, then for every infinite sequence $\pi$ of increasing positive integers the function

\[ f_\pi(P) = 1 \text{ for } P \in e_m,\ m \in \pi, \qquad = 0 \text{ for all other } P, \]

is in $L$, and for different sequences $\pi_1$, $\pi_2$,

\[ |f_{\pi_1} - f_{\pi_2}| \ge w^*(0, A)^{1/p}. \]

On the other hand, with such $S$, $M_{(w)}^q$ is separable for $1 \le q < \infty$ for all $w$, and has dimensionality the power of the continuum if $q = \infty$ and $w$ is bounded on $S$.


6. The conjugate spaces. Let $L'$, $M'$ denote the closed linear subspaces spanned by the $f$ in $L$ and $g$ in $M$ respectively with $\gamma\{f(P) \ne 0\}$, $\gamma\{g(P) \ne 0\}$ finite. Then $L' = L$ in Case $(C_1)$ with $1 \le p \le \infty$ and in Case $(C_2)$ with $1 \le p < \infty$; otherwise $L'$ is not all of $L$. Also, $M' = M$ except for $q = \infty$ in Case $(C_2)$; for this case, $M'$ is not all of $M$.


THEOREM 6.1. 

(i) If $1 \le p < \infty$, the conjugate space to $L'$ is $M$, assuming, if $p = 1$, that $S$ has property (R).

(ii) If $1 \le q < \infty$, the conjugate space to $M'$ is $L$, assuming, if $q = 1$, that $S$ has property (R).


Proof of (i). In view of Theorem 5.1(i) we need only show that every bounded linear functional $\phi(f)$, $f \in L'$ has the form, for some $g$ in $M$,

\[ \int f(P)\, g(P)\,d\gamma(P). \tag{6.1} \]

Now $\phi(1_e)$ is defined for all $e$ and $\phi(1_e) \to 0$ when $\gamma(e) \to 0$. The Radon-Nikodym theorem then implies: for every $E$ which is a countable union of sets of finite measure there is a $g(E) = g(E, P)$ vanishing outside $E$, with finite integral on every $e \subset E$ and such that

\[ \phi(1_e) = \int_e g(E, P)\,d\gamma(P) \]

for every $e \subset E$. This implies $[g(E)]_q \le |\phi|$. We can suppose $E$ chosen to give $[g(E)]_q$ its maximum possible value.

If $p > 1$ Theorem 5.3 (ii) then shows that $g(E_1)$ is the zero function whenever $E_1$, $E$ are disjoint, and we set $g(P) = g(E, P)$ for all $P$; then

\[ \phi(1_e) = \int_e g(P)\,d\gamma(P) \]

for all $e$. Hence $\phi$ coincides for all $f$ in $L'$ with the bounded linear functional defined by this $g$ through (6.1).

If $p = 1$ we use the decomposition of property (R) to define a single $g(P)$ to coincide on each $e_\alpha$ with $g(e_\alpha, P)$. For this $g$ we have $[g]_q \le |\phi|$ and the argument proceeds as before.


Proof of (ii). In view of Theorem 5.1 (ii) we need only show that every bounded linear functional $\phi(g)$, $g \in M'$ is of the form, for some $f$ in $L$,

\[ \int f(P)\, g(P)\,d\gamma(P). \tag{6.2} \]

As in the proof of (i), using Theorem 5.3(i), or property (R), we obtain an $f(P)$ with $|f|_p \le |\phi|$ and such that

\[ \phi(g) = \int f(P)\, g(P)\,d\gamma(P) \]

for all $g$ in $M'$. This implies (ii).


COROLLARY. $(L_{(w)}^p)^* = M_{(w)}^q$ if $1 \le p < \infty$ in Cases $(C_1)$ and $(C_2)$ (assuming, if $p = 1$, that $S$ has property (R)). And $M_{(w)}^q$ is part but not all of $(L_{(w)}^p)^*$ in Case $(C_3)$ for every $p$. On the other hand, $(M_{(w)}^q)^* = L_{(w)}^p$ if $1 \le q < \infty$ in all Cases $(C_1)$, $(C_2)$, $(C_3)$ (assuming, if $q = 1$, that $S$ has property (R)).


It follows that $L_{(w)}^p$ and $M_{(w)}^q$ are reflexive if $1 < p < \infty$ (i.e., $1 < q < \infty$) in Cases $(C_1)$ and $(C_2)$, and for $p = 1$ or $\infty$ (i.e., $q = \infty$ or 1) in Case $(C_1)$ if $S$ is the union of a finite number of atoms, and that $L_{(w)}^p$ and $M_{(w)}^q$ are not reflexive in all other cases.


REFERENCES 


1. S. Banach, Théorie des opérations linéaires (Warsaw, 1932). 

2. H. W. Ellis and Israel Halperin, Function spaces determined by a levelling length function, 
to be published in a subsequent issue of this journal. 

3. G. G. Lorentz, Some new functional spaces, Ann. Math., 51 (1950), 37-55. 


Queen’s University 







SUMMABILITY METHODS DEFINED 
BY RIEMANN SUMS 


J. D. HILL 


1. Introduction. Let $f(x)$ be real valued, bounded, and integrable in the sense of Riemann on the interval $X \equiv (0 \le x \le 1)$, with the value of its integral over $X$ equal to one. For brevity we call such a function admissible. The symbol $X_k^n$ will always denote the interval $(k - 1)/n \le x \le k/n$, $x_k^n$ an arbitrarily chosen point of $X_k^n$, and $\delta$ any specified set of intermediate points

\[ (x_k^n) \qquad (k = 1, 2, \ldots, n;\ n = 1, 2, 3, \ldots). \]

If $\{\alpha_k\}$ is a sequence of 0's and 1's such that

\[ \lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} \alpha_k = \alpha, \]

then it is known [2] that the "pattern integral," defined by

\[ \lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} f(x_k^n)\,\alpha_k, \tag{1.1} \]

exists for all choices of $\delta$ and has the value $\alpha$.

It is clear that (1.1) may also be regarded as defining a method of summability, which we denote by $(\mathfrak{R}, f, \delta)$, and in §2 we find the condition under which this method includes the method $(C, 1)$ of arithmetic means. In §3, by reinterpreting certain results of Agnew and Rado, we call attention to the existence of two classes of functions for which $(\mathfrak{R}, f, \delta)$ is equivalent to $(C, 1)$. We conclude with a pair of examples, the first of which shows that $(\mathfrak{R}, f, \delta)$, for certain $f$, may be definitely stronger than $(C, 1)$ for bounded sequences.

In terms of the pattern integral, the results exhibit conditions under which the existence of the pattern integral implies that the pattern $\{\alpha_k\}$ has a density in the sense of $(C, 1)$; and the first example shows that the pattern integral may exist without the pattern having a $(C, 1)$-density.


Received February 19, 1952; in revised form December 18, 1952.


2. Inclusion of $(C, 1)$ by $(\mathfrak{R}, f, \delta)$. In addition to the definitions in §1 we need the following facts from the theory of summability. A transformation of the form

\[ T_n = \sum_{k=1}^{n} a_{nk}\, s_k \qquad (n = 1, 2, 3, \ldots) \tag{T} \]

defines a method of summability by means of which a sequence $\{s_n\}$ is said to be summable-$T$ to $s$ if $T_n \to s$ as $n \to \infty$. If every convergent sequence is summable-$T$ to its ordinary limit, then $T$ is said to be regular. In order that $T$ be regular the following conditions are necessary and sufficient:

\[ \lim_{n\to\infty} a_{nk} = 0 \qquad (k = 1, 2, 3, \ldots), \tag{2.1} \]
\[ \lim_{n\to\infty} \sum_{k=1}^{n} a_{nk} = 1, \tag{2.2} \]
\[ \sup_n \sum_{k=1}^{n} |a_{nk}| < \infty. \tag{2.3} \]
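The three regularity conditions can be inspected numerically for the triangular matrix $a_{nk} = f(x_k^n)/n$ that underlies (1.1). The following is a minimal sketch, not part of the paper: the function names, the choice $f(x) = 2x$ (which is admissible) and the use of midpoints as intermediate points are all assumptions made for illustration.

```python
from fractions import Fraction

def riemann_row(f, n):
    """Row n of a_{nk} = f(x_k^n)/n, with the assumed midpoints x_k^n = (2k-1)/(2n)."""
    return [f(Fraction(2 * k - 1, 2 * n)) / n for k in range(1, n + 1)]

def regularity_report(f, sizes):
    """Collect the quantities appearing in conditions (2.1), (2.2), (2.3)."""
    rows = {n: riemann_row(f, n) for n in sizes}
    return {
        "first_column": [rows[n][0] for n in sizes],                    # (2.1): should tend to 0
        "row_sums": [sum(rows[n]) for n in sizes],                      # (2.2): should tend to 1
        "abs_row_sums": [sum(abs(a) for a in rows[n]) for n in sizes],  # (2.3): should stay bounded
    }

# Illustrative admissible function: f(x) = 2x has integral 1 over [0, 1].
report = regularity_report(lambda x: 2 * x, [10, 100, 1000])
```

With midpoints the row sums of this particular matrix are exactly 1, so the sketch only illustrates the conditions; it proves nothing about other choices of $f$ or $\delta$.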


A method $T_1$ is said to include a method $T_2$ if every sequence summable-$T_2$ is summable-$T_1$ to the same value. If each of $T_1$ and $T_2$ includes the other, then they are equivalent. These definitions can be phrased to hold with respect to a specified class of sequences. For example, it will be necessary to employ the phrase, equivalent for bounded sequences, with its obvious meaning. A more restrictive concept than the latter is the following. The methods $T_1$ and $T_2$ are said to be absolutely equivalent for bounded sequences if for each bounded sequence $\{s_n\}$ the corresponding transforms are related by means of the condition

\[ \lim_{n\to\infty} \big[T_n^{(1)} - T_n^{(2)}\big] = 0. \]


As indicated above, we use the notation $(\mathfrak{R}, f, \delta)$ for the method of summability defined by the transformation

\[ T_n = \frac{1}{n}\sum_{k=1}^{n} f(x_k^n)\, s_k \qquad (n = 1, 2, 3, \ldots), \tag{2.4} \]

where $f$ is admissible and $\delta = (x_k^n)$ is a given set of intermediate points. If $f(x) \equiv 1$ on $X$ we note that (2.4) reduces to the Cesàro method $(C, 1)$.
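The transformation (2.4) is simple to evaluate directly. The sketch below is only an illustration: `riemann_transform`, `midpoint` and the sample sequence are hypothetical names and data, and exact rational arithmetic is used so that the reduction to $(C, 1)$ for $f \equiv 1$ can be checked without rounding.

```python
from fractions import Fraction

def riemann_transform(f, s, point):
    """T_n = (1/n) * sum_{k=1}^{n} f(x_k^n) * s_k with n = len(s).

    `point(k, n)` must return an intermediate point of [(k-1)/n, k/n].
    """
    n = len(s)
    return sum(f(point(k, n)) * s[k - 1] for k in range(1, n + 1)) / n

def cesaro_mean(s):
    """The (C, 1) mean of the finite sequence s."""
    return Fraction(sum(s), len(s))

def midpoint(k, n):
    """One admissible choice of intermediate point (an assumption of this sketch)."""
    return Fraction(2 * k - 1, 2 * n)

sample = [3, -1, 4, 1, -5, 9, 2, 6]          # arbitrary illustrative data
with_constant_f = riemann_transform(lambda x: Fraction(1), sample, midpoint)
```

For $f \equiv 1$ the value `with_constant_f` agrees with `cesaro_mean(sample)`, whatever intermediate points are chosen.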


THEOREM 1. For arbitrary $\delta_1$ and $\delta_2$ the methods $(\mathfrak{R}, f, \delta_1)$ and $(\mathfrak{R}, f, \delta_2)$ are absolutely equivalent for bounded sequences.


Proof. Let the set of intermediate points $\delta_1$ be denoted by $(x_{k,1}^n)$ and the set $\delta_2$ by $(x_{k,2}^n)$. By a theorem of Cooke [3, p. 105] we have only to show that

\[ D_n \equiv \frac{1}{n}\sum_{k=1}^{n} |f(x_{k,1}^n) - f(x_{k,2}^n)| = o(1). \]

But this is immediate. For let $M_k^n = \sup f(x)$ on $X_k^n$, and $m_k^n = \inf f(x)$ on $X_k^n$ ($k = 1, 2, \ldots, n$; $n = 1, 2, 3, \ldots$). Then

\[ D_n \le \frac{1}{n}\sum_{k=1}^{n} (M_k^n - m_k^n) = o(1). \]


THEOREM 2. Every method $(\mathfrak{R}, f, \delta)$ includes $(C, 1)$ for bounded sequences.


Proof. For sequences of 0's and 1's this theorem is merely a restatement of the "principal theorem" in [2]. For arbitrary bounded sequences the proof remains the same.







In order to discuss the inclusion of $(C, 1)$ by $(\mathfrak{R}, f, \delta)$ in the general case, we denote by $t_n$ the $(C, 1)$ transform,

\[ t_n = \frac{1}{n}\sum_{k=1}^{n} s_k, \]

of an arbitrary sequence $\{s_n\}$. Then

\[ s_k = k\,t_k - (k - 1)\,t_{k-1} \qquad (k = 1, 2, 3, \ldots;\ t_0 = 0) \]

and this expression for $s_k$ in (2.4) yields

\[ T_n = \frac{1}{n}\sum_{k=1}^{n} k\,[f(x_k^n) - f(x_{k+1}^n)]\, t_k \qquad (n = 1, 2, 3, \ldots), \tag{2.5} \]

where $f(x_{n+1}^n)$ is understood to be zero.
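The passage from (2.4) to (2.5) is a summation by parts, and it can be checked on arbitrary data. The sketch below is illustrative only; the helper names and the sample values are assumptions, and the list `fvals` stands for the numbers $f(x_1^n), \ldots, f(x_n^n)$.

```python
from fractions import Fraction

def direct_form(fvals, s):
    """(2.4): T_n = (1/n) * sum_k f(x_k^n) * s_k."""
    n = len(s)
    return sum(fvals[k] * s[k] for k in range(n)) / n

def cesaro_form(fvals, s):
    """(2.5): T_n = (1/n) * sum_k k * [f(x_k^n) - f(x_{k+1}^n)] * t_k."""
    n = len(s)
    t, partial = [], Fraction(0)
    for k in range(1, n + 1):
        partial += s[k - 1]
        t.append(partial / k)                  # t_k, the (C, 1) transform of s
    ext = list(fvals) + [Fraction(0)]          # convention: f(x_{n+1}^n) = 0
    return sum(k * (ext[k - 1] - ext[k]) * t[k - 1] for k in range(1, n + 1)) / n

fvals = [Fraction(3, 2), Fraction(1, 2), Fraction(2), Fraction(-1, 3)]
s = [Fraction(1), Fraction(-2), Fraction(5), Fraction(7)]
```

Both functions return the same rational number for these inputs, as the identity requires.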


THEOREM 3. In order that $(\mathfrak{R}, f, \delta)$ include $(C, 1)$ for a given $\delta$ it is necessary and sufficient that

\[ \sup_n \frac{1}{n}\sum_{k=1}^{n} k\,|f(x_k^n) - f(x_{k+1}^n)| \equiv K(\delta) < \infty. \tag{2.6} \]


Proof. In the notation above it is clear that the statements "$\{s_n\}$ is an arbitrary $(C, 1)$-summable sequence" and "$\{t_n\}$ is an arbitrary convergent sequence" are equivalent. Consequently, convergence in (2.4) for every $(C, 1)$-summable $\{s_n\}$ is equivalent to convergence in (2.5) for every convergent $\{t_n\}$. In order that the latter be true it is necessary and sufficient that the matrix

\[ \Big(\frac{k}{n}\,[f(x_k^n) - f(x_{k+1}^n)]\Big) \]

be regular, and the conditions (2.1), (2.2), (2.3) in this case reduce simply to (2.6).

It seems reasonable to expect that the satisfaction of (2.6) for all $\delta$ can be characterized by some simple property of the function $f(x)$. That this is in fact the case is shown by the next theorem, the proof of which is facilitated by the following lemma.


LEMMA 1. If (2.6) holds for all $\delta$, then $\sup_\delta K(\delta) < \infty$.


Proof. Suppose to the contrary that $\sup_\delta K(\delta) = +\infty$. Then for each $i = 1, 2, 3, \ldots$ there exists a set of intermediate points

\[ \delta_i = (x_{k,i}^n) \]

such that $K(\delta_i) > i$. Hence there exists a sequence of indices $\{m_i\}$ such that

\[ \frac{1}{m_i}\sum_{k=1}^{m_i} k\,|f(x_{k,i}^{m_i}) - f(x_{k+1,i}^{m_i})| > i, \]

and it is easily seen that $\{m_i\}$ must contain a strictly increasing subsequence $\{n_j\} \equiv \{m_{i_j}\}$. Let a set of intermediate points be defined as follows: $x_k^n$ is arbitrary if $n \ne n_j$ ($k = 1, 2, \ldots, n$); and

\[ x_k^{n_j} = x_{k,i_j}^{n_j} \qquad (k = 1, 2, \ldots, n_j;\ j = 1, 2, 3, \ldots). \]

Then (2.6) is evidently violated by this choice of $\delta$.


THEOREM 4. In order that (2.6) hold for all $\delta$ it is necessary and sufficient that the function $x f(x)$ be of bounded variation on $X$.


Proof. We first observe that

\[ \sum_{k=1}^{n} \frac{k}{n}\,|f(x_k^n) - f(x_{k+1}^n)| \le \sum_{k=1}^{n} |x_k^n f(x_k^n) - x_{k+1}^n f(x_{k+1}^n)| + O(1), \tag{2.7} \]

\[ \sum_{k=1}^{n} |x_k^n f(x_k^n) - x_{k+1}^n f(x_{k+1}^n)| \le \sum_{k=1}^{n} \frac{k}{n}\,|f(x_k^n) - f(x_{k+1}^n)| + O(1), \tag{2.8} \]

where the quantities $O(1)$, entering here and below, are independent of $\delta$. Then if $x f(x)$ is of bounded variation on $X$, we find from (2.7) that

\[ \sum_{k=1}^{n} \frac{k}{n}\,|f(x_k^n) - f(x_{k+1}^n)| \le V[x f(x)] + O(1) = O(1), \]

where $V$ denotes total variation. This establishes the sufficiency.

To prove the necessity, let $0 = x_0 < x_1 < \ldots < x_m = 1$ be an arbitrary partition of the interval $X$. Fix an integer $p$ so large that at most one of the points $x_i$ lies in any sub-interval $X_k^p$, and let

\[ \delta = (x_k^n) \]

be any set of intermediate points such that the set $(x_0, x_1, \ldots, x_m)$ is contained in the set $(x_1^p, x_2^p, \ldots, x_p^p)$. Then using (2.8) and Lemma 1, we have

\[ \sum_{i=1}^{m} |x_{i-1} f(x_{i-1}) - x_i f(x_i)| \le \sum_{k=1}^{p} |x_k^p f(x_k^p) - x_{k+1}^p f(x_{k+1}^p)| \le \sum_{k=1}^{p} \frac{k}{p}\,|f(x_k^p) - f(x_{k+1}^p)| + O(1) \le \sup_\delta K(\delta) + O(1) = O(1). \]

This completes the proof.

Combining Theorems 3 and 4 we obtain


THEOREM 5. In order that $(\mathfrak{R}, f, \delta)$ include $(C, 1)$ for all $\delta$ it is necessary and sufficient that $x f(x)$ be of bounded variation on $X$.


3. Equivalence of $(C, 1)$ and $(\mathfrak{R}, f, \delta)$. For the sake of completeness we now wish to point out that results of Agnew and Rado yield two classes of monotone functions for which $(\mathfrak{R}, f, \delta)$ is equivalent to $(C, 1)$. It is convenient, however, to begin with the following obvious lemma.


LEMMA 2. In order that $(\mathfrak{R}, f, \delta)$ be equivalent to $(C, 1)$ for all $\delta$ it is necessary and sufficient that $x f(x)$ be of bounded variation on $X$, and that the matrix

\[ \Big(\frac{k}{n}\,[f(x_k^n) - f(x_{k+1}^n)]\Big) \]

in (2.5) define a method $(\mathfrak{R}^*, f, \delta)$ equivalent to convergence for all $\delta$.


The next lemma is a result of Rado [5, p. 274] adapted to the present situation. 
Essentially the same result was given earlier by Agnew [1, p. 245]. 


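LEMMA 3. If $(\mathfrak{R}^*, f, \delta)$ is regular for a given $\delta$ and if there exist constants $\theta_\delta$ ($0 < \theta_\delta < 1$) and $N_\delta > 0$, such that

\[ \sum_{k=1}^{n-1} \frac{k}{n}\,|f(x_k^n) - f(x_{k+1}^n)| \le \theta_\delta\, |f(x_n^n)| \qquad (\text{all } n > N_\delta), \]

then $(\mathfrak{R}^*, f, \delta)$ is equivalent to convergence.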


Using these lemmas we easily deduce the following theorems which are 
essentially contained in results of Agnew [1, p. 251]. 


THEOREM 6. If $f(x)$ is non-decreasing then $(\mathfrak{R}, f, \delta)$ is equivalent to $(C, 1)$ for all $\delta$.


Proof. To show that the hypotheses of Lemma 2 are satisfied, we first observe that $x f(x)$ is of bounded variation if $f(x)$ is non-decreasing. This implies, by Theorem 4, that $(\mathfrak{R}^*, f, \delta)$ is regular for all $\delta$. Turning now to Lemma 3, we have to show that there exist constants $\theta$ ($0 < \theta < 1$) and $N > 0$, independent of $\delta$, such that

\[ f(x_n^n) - \frac{1}{n}\sum_{k=1}^{n} f(x_k^n) \le \theta\, |f(x_n^n)| \qquad (\text{all } n > N;\ \text{all } \delta). \tag{3.1} \]

To accomplish this we recall the assumption

\[ \int_0^1 f(x)\,dx = 1, \tag{3.2} \]

which, together with the fact that $f(x)$ is non-decreasing, implies that $f(x) > 0$ throughout an interval $X_m^m$. Condition (3.2) also implies the existence of an integer $N > m$ such that

\[ \frac{1}{n}\sum_{k=1}^{n} f(x_k^n) > \tfrac12 \]

for all $n > N$ and all $\delta$. Now fix $\theta$ ($0 < \theta < 1$) so that $(1 - \theta) f(1) < \tfrac12$. Then we have

\[ (1 - \theta)\, f(x_n^n) \le (1 - \theta)\, f(1) < \tfrac12 < \frac{1}{n}\sum_{k=1}^{n} f(x_k^n), \]

for all $n > N$ and all $\delta$, and (3.1) follows at once.


THEOREM 7. If $f(x)$ is non-increasing with $f(1) > \tfrac12$, then $(\mathfrak{R}, f, \delta)$ is equivalent to $(C, 1)$ for all $\delta$.


Proof. The proof parallels the preceding one except that (3.1) is now replaced by

\[ \frac{1}{n}\sum_{k=1}^{n} f(x_k^n) - f(x_n^n) \le \theta\, f(x_n^n). \tag{3.3} \]

In this case, since $f(1) > \tfrac12$, we can fix $\theta$ ($0 < \theta < 1$) so that $q \equiv (1 + \theta) f(1) > 1$. We can then choose $N$ so large that

\[ \frac{1}{n}\sum_{k=1}^{n} f(x_k^n) < q \]

for all $n > N$ and all $\delta$. Then

\[ \frac{1}{n}\sum_{k=1}^{n} f(x_k^n) < (1 + \theta)\, f(1) \le (1 + \theta)\, f(x_n^n), \]

and (3.3) follows.

In terms of the pattern integral, Theorems 6 and 7 provide instances in which the existence of the pattern integral implies that the pattern $\{\alpha_k\}$ has a $(C, 1)$-density. Such examples were lacking in [2].

It is of interest to ask if the restriction $f(1) > \tfrac12$ in Theorem 7 is essential. In this connection we have the following example in which $f(1) = \tfrac12$, and the theorem fails to hold.


Example 1. Let $f^*(x)$ be defined as $\tfrac32$ for $0 \le x \le \tfrac12$, and as $\tfrac12$ for $\tfrac12 < x \le 1$. Then $f^*(x)$ is admissible and non-increasing but $(\mathfrak{R}, f^*, \delta)$, which includes $(C, 1)$ for all $\delta$ by Theorem 5, is definitely stronger than $(C, 1)$. To prove this we consider the sequence

\[ \{\alpha_k^*\} = (0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, \ldots), \]

composed of groups of 0's and 1's, where each group beyond the second contains twice as many elements as the preceding group. Let $\{t_n^*\}$ be the $(C, 1)$-transform of $\{\alpha_k^*\}$ and use the notation $t^*(n)$ as alternative to $t_n^*$. Then it is easy to see that $t^*(2^{2i}) \to \tfrac13$, while $t^*(2^{2i+1}) \to \tfrac23$, so that $\{\alpha_k^*\}$ is not summable-$(C, 1)$.

On the other hand, we can show that $\{\alpha_k^*\}$ is summable-$(\mathfrak{R}, f^*, \delta)$ to the value $\tfrac12$. For let $n$ be given and determine the unique integer $i = i(n)$ such that either (a) $2^{2i-1} < n \le 2^{2i}$, or (b) $2^{2i} < n \le 2^{2i+1}$. Then in case (a) we find that

\[ T_n^* = \frac{1}{n}\sum_{k=1}^{n} f^*(x_k^n)\,\alpha_k^* = \frac{3}{2n}\sum_{k=1}^{[n/2]} \alpha_k^* + \frac{1}{2n}\sum_{k=[n/2]+1}^{n} \alpha_k^* + o(1) \]
\[ = \frac{3}{2n}\sum_{j=1}^{i-2} 2^{2j} + \frac{3}{2n}\big([n/2] - 2^{2i-2}\big) + \frac{1}{2n}\big(2^{2i-1} - [n/2]\big) + o(1) = \tfrac12 - \frac{2}{n} + o(1). \]

The choice of $\delta$ is obviously immaterial here except in one interval, and this interval yields a term $o(1)$ for either functional value. A similar calculation in case (b) shows that $T_n^* = \tfrac12 - (2/n) + o(1)$. Consequently, the sequence $\{\alpha_k^*\}$ is summable-$(\mathfrak{R}, f^*, \delta)$ to the value $\tfrac12$.
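Example 1 lends itself to an exact numerical check. The sketch below is an illustration under stated assumptions: the step function takes the value 3/2 on the closed interval [0, 1/2], the intermediate points are the right end points $x_k^n = k/n$, and the function names are hypothetical. With these choices the Riemann-sum transform equals $\tfrac12 - 2/n$ exactly for even $n \ge 4$, while the $(C, 1)$ means oscillate.

```python
from fractions import Fraction

def pattern(k):
    """alpha_k^*: 1 when 2^(2i) < k <= 2^(2i+1) for some i >= 1, otherwise 0."""
    m = (k - 1).bit_length()          # smallest m with 2^m >= k
    return 1 if (m >= 3 and m % 2 == 1) else 0

def f_star(x):
    """Step function of Example 1 (value at x = 1/2 assumed to be 3/2)."""
    return Fraction(3, 2) if x <= Fraction(1, 2) else Fraction(1, 2)

def cesaro(n):
    """(C, 1) transform t^*(n) of the pattern."""
    return Fraction(sum(pattern(k) for k in range(1, n + 1)), n)

def riemann(n):
    """T_n^* with the assumed intermediate points x_k^n = k/n."""
    return sum(f_star(Fraction(k, n)) * pattern(k) for k in range(1, n + 1)) / n

cesaro_along_even_powers = [cesaro(4 ** i) for i in range(2, 7)]      # tends to 1/3
cesaro_along_odd_powers = [cesaro(2 * 4 ** i) for i in range(2, 7)]   # tends to 2/3
riemann_values = [riemann(2 ** j) for j in range(4, 12)]              # tends to 1/2
```

Other choices of intermediate points change each $T_n^*$ by at most the contribution of one sub-interval, in line with the $o(1)$ term in the text.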

In so far as the pattern integral is concerned, this example shows that the latter may exist without the pattern $\{\alpha_k\}$ having a $(C, 1)$-density. This question was left open in [2].

In connection with Theorem 7 and the fact that the condition $f(1) > \tfrac12$ cannot be weakened, the following example is of interest.


Example 2. For any $\alpha > 1$ the function $f_\alpha(x) \equiv \alpha(1 - x)^{\alpha-1}$ is admissible and strictly decreasing, with $f_\alpha(1) = 0$. Moreover, it can be shown that $(\mathfrak{R}, f_\alpha, \delta)$ is equivalent to $(C, 1)$ for bounded sequences. In view of Theorem 1 we can make any convenient choice of $\delta$, and we select $\delta^-$ defined by

\[ x_k^n = (k - 1)/n \qquad (k = 1, 2, \ldots, n;\ n = 1, 2, 3, \ldots). \]

Then the matrix of $(\mathfrak{R}, f_\alpha, \delta^-)$ reduces to $(\alpha(n - k + 1)^{\alpha-1}/n^\alpha)$, which is equivalent to the Nörlund matrix corresponding to the defining sequence $\{k^{\alpha-1}\}$. Therefore, any bounded sequence summable-$(\mathfrak{R}, f_\alpha, \delta^-)$, say to $s$, is summable to $s$ by the classical Abel method [7, p. 426], and hence summable-$(C, 1)$ to $s$ [4, p. 37].


4. Some further remarks. One observes that $(\mathfrak{R}, f_2, \delta^-)$ of the preceding Example 2 is equivalent to $(C, 2)$, and this raises the question of the relationship between $(\mathfrak{R}, f, \delta)$ and $(C, \alpha)$ in general. In this regard we state without proof the following facts.


(4.1) If there exists a Riemann integrable function $f_\alpha^*(x)$ and a set of subdivision points $\delta_\alpha$ such that $(\mathfrak{R}, f_\alpha^*, \delta_\alpha)$ coincides with $(C, \alpha)$ for $\alpha > 1$, then $f_\alpha^*(x)$ is equal to $f_\alpha(x) \equiv \alpha(1 - x)^{\alpha-1}$ almost everywhere.


(4.2) In order that there exist a set of subdivision points $\delta_\alpha$ such that $(\mathfrak{R}, f_\alpha, \delta_\alpha)$ coincides with $(C, \alpha)$, it is necessary and sufficient that $1 \le \alpha \le 2$.
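The case $\alpha = 2$ of (4.2) can be made concrete. The sketch below is not taken from the paper: it assumes the usual sequence-to-sequence form of $(C, 2)$, in which $s_k$ receives the weight $\binom{n-k+1}{1}/\binom{n+1}{2}$ in the $n$th mean, and it tries the hypothetical subdivision points $x_k^n = k/(n+1)$ with $f_2(x) = 2(1 - x)$.

```python
from fractions import Fraction
from math import comb

def f2(x):
    """f_alpha for alpha = 2."""
    return 2 * (1 - x)

def trial_point(k, n):
    """Hypothetical subdivision point x_k^n = k/(n+1), chosen for this sketch."""
    return Fraction(k, n + 1)

def riemann_row(n):
    """Row n of the matrix (f2(x_k^n)/n) for the trial points."""
    return [f2(trial_point(k, n)) / n for k in range(1, n + 1)]

def cesaro2_row(n):
    """Row n of (C, 2) under the convention stated in the lead-in."""
    return [Fraction(comb(n - k + 1, 1), comb(n + 1, 2)) for k in range(1, n + 1)]

def points_are_admissible(n):
    """Check that (k-1)/n <= x_k^n <= k/n for every k."""
    return all(Fraction(k - 1, n) <= trial_point(k, n) <= Fraction(k, n)
               for k in range(1, n + 1))
```

For every $n$ tried, the two rows coincide and the trial points fall in the required sub-intervals, which is consistent with $\alpha = 2$ being an admissible value in (4.2).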


(4.3) The sequence $\{(-1)^{k-1} k^3\}$, which is not summable-$(C, 3)$, is summable-$(\mathfrak{R}, f_3, \delta^-)$ to zero.
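Statement (4.3) can be explored numerically. The sketch below rests on assumptions: the sequence is read as $s_k = (-1)^{k-1} k^3$, the $(C, 3)$ mean is taken in its usual sequence-to-sequence form with weights $\binom{n-k+2}{2}/\binom{n+2}{3}$, and the matrix of $(\mathfrak{R}, f_3, \delta^-)$ is $3(n - k + 1)^2/n^3$ as in Example 2. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def s(k):
    """Assumed reading of the sequence in (4.3)."""
    return (-1) ** (k - 1) * k ** 3

def cesaro3(n):
    """(C, 3) mean of s_1, ..., s_n under the stated convention."""
    total = sum(comb(n - k + 2, 2) * s(k) for k in range(1, n + 1))
    return Fraction(total, comb(n + 2, 3))

def riemann_f3(n):
    """Transform with matrix 3 (n-k+1)^2 / n^3, i.e. f_3 at x_k^n = (k-1)/n."""
    total = sum((n - k + 1) ** 2 * s(k) for k in range(1, n + 1))
    return Fraction(3 * total, n ** 3)

cesaro_tail = [float(cesaro3(n)) for n in (2000, 2001)]      # stays near -3/4 and +3/4
riemann_tail = [float(riemann_f3(n)) for n in (2000, 2001)]  # both close to 0
```

The computed $(C, 3)$ means keep alternating in sign with magnitude near 3/4, while the other transform shrinks roughly like $1/n$; this is numerical evidence for the assumed reading, not a proof.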


A connection between general triangular methods $(a_{nk})$ and the methods $(\mathfrak{R}, f, \delta)$ may be established as follows.


(4.4) Let $(a_{nk})$ be triangular and regular and let $\phi_n(x) \equiv n a_{nk}$ for $(k - 1)/n < x \le k/n$ ($k = 1, 2, \ldots, n$; $n = 1, 2, 3, \ldots$). Suppose that $|\phi_n(x)| \le \phi(x)$ a.e. for all $n > N$, where $\phi(x)$ is positive and Lebesgue integrable; and that there exists a Riemann integrable function $f(x)$ such that $\phi_n(x) \to f(x)$ a.e. Then, for all $\delta$, $(\mathfrak{R}, f, \delta)$ is absolutely equivalent to $(a_{nk})$ for bounded sequences.


The conclusion in (4.4) cannot in general be strengthened to equivalence. To see this we choose for $(a_{nk})$ the matrix of $(C, 3)$, so that $f(x)$ in (4.4) can be taken as $f_3(x)$ in (4.1). The assertion then follows from (4.3).


REFERENCES 


1. R. P. Agnew, On equivalence of methods of evaluation of sequences, Tôhoku Math. J., 36 (1932), 244-252.

2. R. E. Carr and J. D. Hill, Pattern integration, Proc. Amer. Math. Soc., 2 (1951), 242-245.

3. R. G. Cooke, Infinite matrices and sequence spaces (London, 1950).

4. E. Kogbetliantz, Sommation des séries et intégrales divergentes par les moyennes arithmétiques et typiques (Paris, 1931).

5. R. Rado, Some elementary Tauberian theorems I, Quarterly J. Math., 9 (1938), 274-282.

6. M. Riesz, Sur l'équivalence de certaines méthodes de sommation, Proc. London Math. Soc. (2), 22 (1923), 412-419.

7. G. F. Woronoi and J. D. Tamarkin, Extensions of the notion of the limit of the sum of the terms of an infinite series, Ann. Math. (2), 33 (1932), 422-428.


Michigan State College 







ON THE CURVATURE TENSOR OF EINSTEIN’S 
GENERALIZED THEORY OF GRAVITATION 


K. W. LAMSON 


Introduction. Einstein, in his Generalized Theory of Gravitation [1], deals with a tensor


Réve =e Roeo: 


whose ninety-six independent components are complex-valued functions of four real variables $x_i$. Schouten [4, p. 261 (89)] has decomposed the general relative tensor, $v_{\alpha\beta\gamma\rho\sigma\tau}$, antisymmetric in $\alpha$, $\beta$, $\gamma$, and in $\rho$, $\sigma$, into five irreducible parts. This decomposition can be applied to the $R$ of Einstein if we define the tensor $v$ by


» 
(1) VaByper = Vlasyiloe]r = rabyRepe, 


where $\epsilon$ is the usual antisymmetric relative tensor with components $\pm 1$. It is essential that all indices run from 1 to 4. The five parts can be expressed in terms of five tensors: $a$, $b$, $c$, $d$, each with two indices, and $h$, with four indices. The components of these tensors are linear combinations of the components of $R$ and are introduced in (3), (4), and (5) of this paper.

If the field equations are satisfied there will be conditions on the five tensors 
given by the equality and reality relations (13) and (14). The purpose of this 
paper is the determination of these tensors and the derivation of these relations. 

From the invariance and the uniqueness of the decomposition [4, p. 255] one 
would expect that these five tensors would appear in any discussion of the physi- 
cal meaning of the field equations. 


1. Schouten's result. This may be stated in terms of the following operators on $v_{\alpha\beta\gamma\rho\sigma\tau}$:

$P_{45}$, permuting the indices in places 4 and 5,
$n!\,[ijk\ldots]$, the negative symmetric group on $ijk\ldots$,
$n!\,(ijk\ldots)$, the positive symmetric group on $ijk\ldots$,

where $n$ is the number of symbols $ijk\ldots$. These operators are elements of the group ring formed from the symmetric group on six objects [2, p. 72].

Schouten's final result is that $v$ may be uniquely and irreducibly decomposed into the following five parts,


Received March 20, 1952; in revised form July 20, 1952 










(2) Vabyoer = ${(1234][56](15) (26) — Pss[1234][56](15) (26) }rasyoe: 1s 
: 

+ ¥ {[1236][45](14) (25) }vasyp0: a 

+ 2{[1234](156) — P4s[1234](156) } vas, er . 

+ 2{ [123][456] (14) (25) (36) }r.sree: : 

+ ¥ {[123][45] (146) (25) }rasyoes ‘t 


The Young diagrams at the right show how the operators are formed. This result 
is taken from [4, p. 261 (89)]. For a detailed account of Young’s work the reader 
should consult [3]. 


2. Decomposition of the tensor R in terms of ninety-six parameters. After 
the operations indicated in (2) have been actually carried out, the tensor v 
is to be expressed in terms of the R’s, from (1). Since (2) is homogeneous the 
decomposition applies to relative tensors with regard to weight. It can then be 
seen that the five parts may be written 


(3) R"se0e = [apedp — Appde] + [b,058] 
+ [cs055 — Capde] + [ersoed™ | + [hsp], 


where the ninety-six parameters are described in the following table. 














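TABLE (4)

Tensor                          Weight   Symmetry conditions                                             Number of independent parameters
$a_{\rho\sigma}$                  0      $a_{\rho\sigma} + a_{\sigma\rho} = 0$                                  6
$b_{\rho\sigma}$                  0      $b_{\rho\sigma} + b_{\sigma\rho} = 0$                                  6
$c_{\rho\sigma}$                  0      $c_{\rho\sigma} - c_{\sigma\rho} = 0$                                 10
$d^{\rho\sigma}$                  1      $d^{\rho\sigma} - d^{\sigma\rho} = 0$                                 10
$h^{\alpha}{}_{\beta\rho\sigma}$  0      $h^{\alpha}{}_{[\beta\rho\sigma]} = h^{\alpha}{}_{\beta\alpha\sigma} = 0$   64
Total                                                                                                     96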











Here $h^{\alpha}{}_{\beta\rho\sigma}$ is antisymmetric in $\rho$ and $\sigma$, and the thirty-two conditions on this tensor imply that $h^{\alpha}{}_{\alpha\rho\sigma} = 0$.








The ninety-six equations (3) are solved as follows: 
Ope = — 16 Rage + 3 Ropar — § Roeay 
+ 15 Race — 16 R sae + 16 Rovay 
= § R°sae + § Ro say 


1 paby 1 
a” = is" Rea + ie” R'm, 


> 
b 
ll 


(5) 


These equations show that $a$, $b$, and $c$ are tensors and that $d$ is a tensor density. From (3) and (5) $h^{\alpha}{}_{\beta\rho\sigma}$ may be found and seen to be a tensor.


3. The field equations. Equations in R alone. From [1, equations 24, 13, 
15, 24c] the field equations may be written 


(6) Rep = Teen — Ten Tis — 4 Tans — 4 Vane + Tes Ph. = 0, 
(7) RaBie = Basie ~ £3 a — Bad Ts = 0, 
(8) las — Tee = 0. 


Equation (6) is equivalent to Einstein’s (24), using (8). 
It is necessary to manipulate these equations in order to get equations in the 
R's alone, namely 


(9) R i, = 0, 
(10) Run + Rim = 0. 


To get (9), differentiate (7) with respect to x,, interchange p and ¢, subtract, 
multiply by g** and contract with respect to uw and a, using (8). To get (10), 
use (6), (8), and the definition of R. Substitution from (3) in (9) and (10) gives 


(11) das + 2bag = 0, 
— Baas — 3dag + bas + bas — Bag — 3éap = 0, 
(12) 7 (bas + bas) = 3 (Cas + Cap). 


From (4), $a_{\alpha\beta}$ is antisymmetric and $c_{\alpha\beta}$ is symmetric, and from (11) and (12) the result follows.

If the Einstein field equations are satisfied and if the tensor $R$ is expressed in the form (3), then

\[ a_{\alpha\beta} + 2b_{\alpha\beta} = 0, \tag{13} \]

(14) $a_{\alpha\beta}$, $b_{\alpha\beta}$, and $c_{\alpha\beta}$ are pure imaginary.










The question as to the sufficiency of these conditions, or whether an alterna- 
tive formulation of the theory could be based on (13) and (14), is left open. 


Added in proof. Since the above was written, the fourth edition of The 
Meaning of Relativity has appeared. The tensor Ry of this edition can be found 
in terms of the invariant parts by contracting equation (3) of this paper with 
respect to a and a: 


Ra = — 3cy = 0, ae = — 5d» 


REFERENCES 


1. A. Einstein, Generalized theory of gravitation (Princeton, 1950), Appendix II: The meaning 
of relativity. 

2. D. E. Littlewood, The theory of group characters (Oxford, 1940). 

3. D. E. Rutherford, Substitutional analysis (Edinburgh, 1948).

4. J. A. Schouten, Der Ricci-Kalkül (Berlin, 1924).

5. H. Weyl, The classical groups (Princeton, 1939).

College of Agriculture and Mechanic Arts 

Mayaguez, Puerto Rico 








A RELATION BETWEEN ULTRASPHERICAL 
AND JACOBI POLYNOMIAL SETS 


FRED BRAFMAN 


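1. Introduction. The Jacobi polynomials may be defined by

\[ P_n^{(\alpha,\beta)}(x) = \frac{(1 + \alpha)_n}{n!}\; {}_2F_1\!\left[\begin{matrix} -n,\ 1 + \alpha + \beta + n; \\ 1 + \alpha; \end{matrix}\ \frac{1 - x}{2}\right] \tag{1} \]

where $(a)_n = a(a + 1)\cdots(a + n - 1)$. Putting $\beta = \alpha$ gives the ultraspherical polynomials $P_n^{(\alpha,\alpha)}(x)$ which have as a special case the Legendre polynomials $P_n(x) = P_n^{(0,0)}(x)$.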
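Definition (1) is a terminating hypergeometric sum, so it can be evaluated exactly. The following sketch is an illustration only; `pochhammer` and `jacobi` are hypothetical helper names, and rational arithmetic is used so that the Legendre special case can be compared with its closed forms.

```python
from fractions import Fraction
from math import factorial

def pochhammer(a, n):
    """(a)_n = a (a+1) ... (a+n-1), with (a)_0 = 1."""
    result = Fraction(1)
    for i in range(n):
        result *= a + i
    return result

def jacobi(n, alpha, beta, x):
    """P_n^{(alpha,beta)}(x) from the terminating 2F1 sum in (1).

    Assumes 1 + alpha is not zero or a negative integer.
    """
    z = (1 - Fraction(x)) / 2
    total = sum(
        pochhammer(-n, j) * pochhammer(1 + alpha + beta + n, j)
        / (pochhammer(1 + alpha, j) * factorial(j)) * z ** j
        for j in range(n + 1)
    )
    return pochhammer(1 + alpha, n) / factorial(n) * total

legendre_2_at_half = jacobi(2, 0, 0, Fraction(1, 2))    # (3x^2 - 1)/2 at x = 1/2
legendre_3_at_third = jacobi(3, 0, 0, Fraction(1, 3))   # (5x^3 - 3x)/2 at x = 1/3
```

At $x = 1$ only the $j = 0$ term survives, so the sketch returns $(1 + \alpha)_n/n!$ there.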

Now the Jacobi polynomials can, of course, be expressed in terms of the simpler ultraspherical polynomials by a relation such as


ps” (x) = > Ce Nand €- 


k=0 
where the c, may be found by the orthogonality properties of the P(x). 
In general, the c, will be very complicated. 


This paper will show that a Jacobi polynomial may also be formed by sum- 
ming across a set of sets of ultraspherical polynomials. 








Let 
fra Fr"). j=0,1,2,..., 
— 1)*(%), (2% 
(2) a ( me 5 )e PEO +E MerO+8) (0), k=1,23..... 
is of k -et!) a—§+1 . i 
Sssme1 on ( 5 \! daa 2 de py +8+1)+k, 4( shite |) 
k=0,1,2,.... 
Then consider the matrix

\[ \| f_{j,k} \|, \tag{3} \]

where $j$ is the column index and $k$ is the row index, and where $j, k = 0, 1, 2, \ldots$. Note that each row, given by fixed $k$ with $j = 0, 1, 2, \ldots$, represents a true ultraspherical polynomial set multiplied through by a common constant factor. The result of this paper is that Jacobi polynomials may be formed from the diagonals of array (3), viz.,

\[ P_n^{(\alpha,\beta)}(x) = \sum_{k=0}^{n} f_{n-k,k} = \sum_{j=0}^{n} f_{j,n-j}. \tag{4} \]


Received October 18, 1951; in revised form September 1, 1952. 













Note that for the special case, $\beta = \alpha$, of the ultraspherical polynomials,

\[ f_{j,k} = 0 \quad (k \ge 1), \qquad f_{j,0} = P_j^{(\alpha,\alpha)}(x). \tag{5} \]

The arrays corresponding to other special Jacobi polynomials have certain simplifications. For example, in the array for $P_n^{(\alpha,\beta)}(x)$ when $\beta = \alpha + 2m$ for $m$ a positive integer, it happens that $f_{j,2k} = 0$ for $m + 1 \le k$, that is, all even-numbered rows after a finite number vanish. Similar results hold for $\beta = \alpha - 2m$, $\beta = \alpha \pm (2m + 1)$.


2. Proof of results (4). First note 


\[ f_{0,0} = 1 = P_0^{(\alpha,\beta)}(x). \tag{6} \]

For $n \ge 1$, the result (4) will be proved in the form


(0.8). _ pliers). Mars) (a= 8) s ( 
(7) PEP (x) = PS (x) + ) amen 
Let the right side of (7) be denoted by $g_n(x)$. The remainder of this section will be devoted to showing that $g_n(x) = P_n^{(\alpha,\beta)}(x)$, for $n \ge 1$. By splitting the summation in (7) into even and odd indices and using definition (2), the reader may show easily that


a a) spaerer. »$(a+8+R)) (x) 








&a(x) = ~ FS sm—4- 
j=0 


Then (4) will be proven once it is demonstrated that $g_n(x) = P_n^{(\alpha,\beta)}(x)$. By using definition (1), it follows that
@F*), 2 > (on) tat 6+ n) (52)? 


n! = » H*), j ! 


(8) g(x) = 











+ (5) 5 ea nt Wl + a + B+ WY 
2 




















aS k! (mn — k)! (32), 3! 
~ PR tet+s+sh (1- 2) + | CPG at gt wate sty 
n! =~ (n — j)! CF), 3! 
+(25 a 6) n—j ‘ecm ag) (2tepete) awllt+ea+s+ n) (252) aH. 
p> k! 7! (n _ hb oe 7)! (A), 
Thus 
—1)"(l+a+6+n), (1—<x\" 
9) gle) = MU tet+ sts) (1-2) 





n—1 z—l\j 
where ; 





m., 













so 4 (eRe >. (**}*** + 5), 
+(*5 >> k! (p — kb)! '? 


The functions Z, will be simplified by means of a generating function. From 
(10), form 


et ee 


@ Ca 2 8 . p 
(1) YZe = y Cet Dp 





. (2=£) >>> (23), *H* + 5), 2 t? 
2 


p=l kel k! (p — k)! 


— + a = patie 


om B o ow (c-Fg*), (yt + j) tt" 
gee. 








or 


ee 1 + (1 — gy store 


(12) > 2, 


a—B+2 


a5 _ p-Haeo ee CF), A — 
+( 2 Ja ) 2 Ps : 








In (11), (12), and subsequently, it is intended that the branch of the power of $(1 - t)$ is the one which approaches $+1$ as $t$ approaches zero.

Next, by splitting into summations over even and odd indices, it follows 
that 








f~ k! 
=u SCE) (_ wt) _ 5 Ge (_ wy 
= 4 i aa 4 > (4), R! 4 


reo) Ge yale ee] 
= u2F; , "as Gea oF; a,~4a)7 ff: 


where, for 8 = a, the indeterminate form vanishes. Substituting the results of 
(13) into (12) gives 





(14) } Z,f@=-1+(1- tPA Q, 
p=l * 
with 
2,7 ass P a—f+1 146- 3 
- po oe oo er we 
o~ an Tia<al* 2 a-nie > 340-4)" 













Using the well-known relation

\[ {}_2F_1(a, b; c; z) = (1 - z)^{-a}\; {}_2F_1\!\Big(a, c - b; c; \frac{z}{z - 1}\Big), \tag{15} \]
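Relation (15), read here as the classical Pfaff transformation of the hypergeometric function, can be spot-checked with truncated power series. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the series is simply cut off after a fixed number of terms, and the sample parameters are chosen so that both arguments $z$ and $z/(z - 1)$ lie well inside the unit disc.

```python
def hyp2f1(a, b, c, z, terms=200):
    """Truncated Gauss series for 2F1(a, b; c; z); adequate only for small |z|."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total

def pfaff_gap(a, b, c, z):
    """Difference between the two sides of the Pfaff transformation."""
    left = hyp2f1(a, b, c, z)
    right = (1 - z) ** (-a) * hyp2f1(a, c - b, c, z / (z - 1))
    return abs(left - right)

gap = pfaff_gap(0.5, 1.25, 2.0, 0.2)   # z/(z-1) = -0.25 here
```

The two sides agree to within floating-point rounding for these sample values; a library routine would be needed for arguments near the boundary of convergence.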


it follows that 








a- Bp t (2 : ae ‘ [* “pei ( t )'| 
ul 2 tea oF 3 3 \2-8 
or 


@—ur fart (1) 
7) Q= gq — jpre 3;\2-2 





On expansion, 


2-9 | = a= Mel _£_)* 
8) @= f4(ai — aj? 2d. (2n)! \2 —1 








(19) ,. _(2-—#** bo= t )" 
O= Gap L2 Gm! N27 





or 





(20) Q= (—1ree > (a — B)s (. ‘ .) 


n=0 n! 
The series in (20) represents a binomial expansion, and (20) becomes 
(21) Q=(1-#)'**. 
The result of (15)—(21) changes (14) into 


(22) > Z,f =—-1+(1-t)*"™. 


p=l 


Expanding the right side of (22) gives 


(23) S27. F Stethe 


p=l p=l p! 







and thus 


(24) zg, 2 Etetih 


Substitute this in (9) to find 

















— 1)*(1 ,(1-x\ 
(25) ate) « (— 17"( ts + 8+), ( 5 *) 
n—1l z—1\ J . 
(l+a+68+n), ()’ (+a+j),-, 
+ 2, j! (n — j)! 
_ (= 1" (l+at Btn), 4)" 
n! 
(l+a),— (lta+6+n), (3) (—2n), 
+ n! u ji +a); 
= Pr?(x), 


as desired. This completes the proof of (7). 


Wayne University 








A THEOREM OF GLAISHER 
LEONARD CARLITZ 
1. Introduction. Let 


$(x-1)(x-2)\cdots(x-p+1) = x^{p-1} - A_1x^{p-2} + \cdots + A_{p-1}$.
Then if $p$ is a prime $> 3$, Glaisher [4] proved

(1.1) $\frac{1}{p}A_{2r} \equiv -\frac{1}{2r}B_{2r} \pmod{p}$,
(1.2) $\frac{1}{p^2}A_{2r+1} \equiv \frac{2r+1}{4r}B_{2r} \pmod{p}$,

where $B_m$ denotes the $m$th Bernoulli number in the notation of Nörlund; it had
been proved earlier by Nielsen [5] that the left members of (1.1) and (1.2) are
integral.
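The congruences (1.1) and (1.2) can be checked numerically for small primes. The following is only a sketch: the helper names are hypothetical, the Bernoulli numbers are generated with $B_1 = -\tfrac12$, and the range $1 \le r \le \tfrac12(p-3)$ is an assumption made here so that every index stays below $p - 1$.

```python
from fractions import Fraction
from math import comb


def bernoulli(n):
    """Return [B_0, ..., B_n] with B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B


def symmetric_functions(p):
    """Return [A_0, ..., A_{p-1}], the elementary symmetric functions of 1..p-1."""
    A = [1]
    for j in range(1, p):
        # multiply prod(1 + i*t) by (1 + j*t)
        A = [1] + [A[k] + j * A[k - 1] for k in range(1, len(A))] + [j * A[-1]]
    return A


def residue(x, p):
    """Residue mod p of a Fraction whose denominator is prime to p."""
    return x.numerator * pow(x.denominator, -1, p) % p


def glaisher_holds(p):
    """Check (1.1) and (1.2) for 1 <= r <= (p-3)/2 (range assumed here)."""
    A, B = symmetric_functions(p), bernoulli(p - 3)
    for r in range(1, (p - 3) // 2 + 1):
        even_ok = (A[2 * r] % p == 0 and
                   (A[2 * r] // p) % p == residue(-B[2 * r] / (2 * r), p))
        odd_ok = (A[2 * r + 1] % p**2 == 0 and
                  (A[2 * r + 1] // p**2) % p
                  == residue(Fraction(2 * r + 1, 4 * r) * B[2 * r], p))
        if not (even_ok and odd_ok):
            return False
    return True
```

For $p = 5$ the coefficients are $A_1, \ldots, A_4 = 10, 35, 50, 24$, and both congruences reduce to $2 \equiv 2 \pmod 5$.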


In this paper we first show that for 1 < r < 3(p — 1), 


2rA ora = -— Lp (p — 2r - 1) Bo, 
(1.3) 


r—1 
e 1 

— (2r + 1)p*>> 4i Bo; Bo,—2; (mod p*); 
i=1 


indeed, a similar but slightly more complicated congruence (mod P‘) is obtained. 
Clearly (1.3) is a refinement of (1.2) and in fact of (1.1) also; alternatively, it 
may be looked on as specifying the residue (mod /) of a certain sum involving 
Bernoulli numbers. 

Glaisher made numerous applications of (1.1) and (1.2); in §§3, 4 we make 
a few additional applications. 

In the remainder of the paper we shall attempt to extend Glaisher’s theorem 
to more general sequences. The generalization depends on the fact that the 


$A_m$ can be expressed in terms of Bernoulli numbers of higher order, namely
[8, p. 148],

(1.4) $A_m = (-1)^m \binom{p-1}{m} B_m^{(p)}$.

Hence if

$f(x) = \sum_{m=1}^{\infty} c_m x^m/m! \qquad (c_1 = 1)$,

where the $c_m$ are integral (mod $p$), and we define $\beta_m^{(k)}$ by means of

(1.5) $(x/f(x))^k = \sum_{m=0}^{\infty} \beta_m^{(k)} x^m/m!$,


Received January 25, 1952. 




it is natural, in view of (1.4), to seek congruences satisfied by $\beta_m^{(p)}$. It will be
assumed throughout the paper that $p$ is a fixed prime greater than 3.

As we shall see, it is indeed not difficult to generalize (1.1) and (1.2) from
this point of view. Moreover, by introducing coefficients $\eta_m^{(k)}$ defined by

(1.6) $\left(\frac{1}{1 + af(x)}\right)^k = \sum_{m=0}^{\infty} \eta_m^{(k)} x^m/m!$,

where $a$ is integral (mod $p$), we also generalize certain results of Nielsen analogous
to (1.1) and (1.2). We remark in this connection that in both (1.5) and (1.6)
the case $k = -p$ as well as $k = p$ is of interest.


2. Proof of (1.3). Put 
(2.1) $S_m = S_m(p) = 1^m + 2^m + \cdots + (p-1)^m$.

Then by Newton's formula we have, for $r$ odd,

$rA_r = \sum_{i=0}^{r-1} (-1)^i A_i S_{r-i}$,

which we write in the form

(2.2) $rA_r - S_1A_{r-1} = S_r - A_1S_{r-1} + \sum_{i=1}^{\frac12(r-3)} A_{2i}S_{r-2i} - \sum_{i=1}^{\frac12(r-3)} A_{2i+1}S_{r-2i-1}$.

Now by a familiar formula we have for (2.1)

(2.3) $S_m = \frac{1}{m+1}\sum_{r=1}^{m+1}\binom{m+1}{r}B_{m+1-r}\,p^r$,
and this implies for r odd, 3 < r < p, 
(2.4) S, = 4B,3ip (mod p*), 
S,1 = Bip (mod p*). 
Thus by (1.1) and (1.2), 
(7-3) }(r—3) 
De AuS-x = — 2) 9: Bub. tr — 2%) Besa” (mod *), 
t= ve 
4(r—8) i(r-9) 9; 1 - , 
)» Asis , ae = zy ee Bs, p ‘ B,-2-1 p (mod Pp), 
t=] f= 
so that 
§(r—3) “4(r—3) 
u Ax, S,-2: — 2 Asi Sp24-1 
be t= 
(2.5) ir—%) 4 
= —(r+1)p° = Ba, By-24-1 (mod p’). 


f—1 40 













Now by (2.3) we find that 
(2.6) S,— Ai S41 = 49'(r — pp + DB +A P(r — 1) — 2)(r + 2)B,_3 


(mod °). 


Hence combining (2.2), (2.5), (2.6) we get 


rA, — $p(p — 1)A 1 = $0 (r — p+ LB 1 


(2.7) + ap (r — 1)(r — 2)(r + 2)B,_; 
4(r—3) 1 
— (r+ 1)p* >. i Bz, Byes (mod p*). 
i=1 


In the next place it follows from [3, §19] that for? > 1, p = 2m + 1, 
(2.8) — (m — t)/pAa, — Aains = § (p — 2t)(m — t)(m — t + 1)p*or1, 


where o,-has the same meaning as in [3, §12]. Also 


Oy) = Arps (mod p’). 


Consequently (2.8) becomes (7 = 2¢ + 1) 
A, 
which by (1.1) yields 


Il 


sir(r —- lj 7 2) 


(2.9) A, = 3p(p — r)Apa — —a p‘B,_; (mod p’*). 
Comparison of (2.7) and (2.9) now gives 
(7 —1)(@ — +r — 1)A, = — 49° (0 — r)\ (Pb — 7 — 1B 
- r(r — 1)(r — 2)(r* — r—5) 4 
(2.10) A(r — 3) p B,-s 
§(r—3) 1 


— (p—r)(r +1)" > 4; Ba Braet (mod p*). 


In particular (2.10) implies 


}(r—3) 

(2.11) (rf —1)A, = — 49°(p — r)B,1 — rp’* yu q; Bu Brow. (mod p*). 
In can be verified that 

(2.12) A;=—n?'(p—3)—- wp’ (mod p*). 


In view of (2.9) one can specify the residue of A2,, 2 < 2¢ < p — 3, mod p’*. 


In this connection the related formula [7, p. 366] 


1 — 1 
(2.13) p (W, — K,) = W, + du or Bo, Bom—2+ (mod P), p = 2m + 1, 


p(p — r)Apa + ar(r — 1)(r — 2)p°A,-s (mod p*), 





pb’). 




where W, = (Ay-1 + 1)/p, Kp = ki t+... + Rp-1, R(r) = (P?—' — 1)/, is of 
interest. 
Another formula of a similar kind is 


i@—- 


3) 
(p + 2)Byir + ip(P + 1B, = 2» 2D) >, eu ~ Bs, Bys1-2 (mod 9”), 


which is an easy consequence of Euler’s formula 


(2m + 1) Bom + > { + ”) Bs, Bom—2y —_ 0 (m > 1). 


r= 


3. An application. It follows from the definition of A,, that 
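$(x-2)(x-4)\cdots(x-2(p-1)) = x^{p-1} - 2A_1x^{p-2} + \cdots + 2^{p-1}A_{p-1}$;

if we put $x = p = 2m + 1$ this evidently becomes

$(-1)^m(1\cdot3\cdot5\cdots(p-2))^2 \equiv 2^{p-1}A_{p-1} - 2^{p-2}pA_{p-2} + 2^{p-3}p^2A_{p-3} \pmod{p^4}$
$\qquad \equiv 2^{2m}(2m)! + 2^{2m}\left(-\tfrac16 + \tfrac1{12}\right)p^3B_{p-3} \pmod{p^4}$
$\qquad \equiv 2^{2m}(2m)!\left(1 + \tfrac1{12}p^3B_{p-3}\right) \pmod{p^4}$,

where we have used

$A_{p-2} \equiv \tfrac13p^2B_{p-3} \pmod{p^3}$,
$A_{p-3} \equiv \tfrac13pB_{p-3} \pmod{p^2}$.

Thus it follows that

(3.1) $(-1)^m\binom{2m}{m} \equiv 2^{4m}\left(1 + \tfrac1{12}p^3B_{p-3}\right) \pmod{p^4}$.

The weaker form of this congruence

$(-1)^m\binom{2m}{m} \equiv 2^{4m} \pmod{p^3}$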


is due to F. Morley (for references see [2, p. 273]); see also Nielsen [6, p. 81] for 
an equivalent result. 
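Congruence (3.1) and Morley's weaker form lend themselves to a direct numerical check. The sketch below is illustrative only; the function names are hypothetical and the Bernoulli numbers are computed with $B_1 = -\tfrac12$.

```python
from fractions import Fraction
from math import comb


def bernoulli_number(n):
    """Return B_n (with B_1 = -1/2) from the standard recurrence."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B[n]


def morley_sides(p):
    """Both sides of (3.1) reduced mod p^4, for a prime p > 3."""
    m, modulus = (p - 1) // 2, p**4
    b = bernoulli_number(p - 3)
    # B_{p-3} has a denominator prime to p, so it can be inverted mod p^4
    b_res = b.numerator * pow(b.denominator, -1, modulus) % modulus
    lhs = (-1) ** m * comb(2 * m, m) % modulus
    rhs = pow(2, 4 * m, modulus) * (1 + pow(12, -1, modulus) * p**3 * b_res) % modulus
    return lhs, rhs
```

For $p = 5$ both sides reduce to 6 modulo 625, since $\binom{4}{2} = 6$ and $256 + 375 = 631$.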


4. Other applications. Let us take next the familiar quotient 


(4.1) $\frac{(np)!}{n!\,(p!)^n} = \frac{(p+1)\cdots(2p-1)}{(p-1)!}\cdot\frac{(2p+1)\cdots(3p-1)}{(p-1)!}\cdots\frac{((n-1)p+1)\cdots(np-1)}{(p-1)!} = \binom{2p-1}{p-1}\binom{3p-1}{p-1}\cdots\binom{np-1}{p-1}$.













But as Glaisher proved 





(4.2) $\binom{kp-1}{p-1} \equiv 1 - \tfrac13k(k-1)p^3B_{p-3} \pmod{p^4}$,

from which it follows that

(4.3) $\frac{(np)!}{n!\,(p!)^n} \equiv 1 - \tfrac19(n^3 - n)p^3B_{p-3} \pmod{p^4}$.

The weaker congruence

$\frac{(np)!}{n!\,(p!)^n} \equiv 1 \pmod{p^3}$

is due to Mason and Child (for references see [2, p. 278]).
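As a sanity check on (4.3), the quotient can be computed exactly and compared with the right member modulo $p^4$. This is a sketch with hypothetical helper names, assuming $p > 3$ so that 9 and the denominator of $B_{p-3}$ are invertible.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_number(n):
    """Return B_n (with B_1 = -1/2) from the standard recurrence."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, k) * B[k] for k in range(m)) / (m + 1))
    return B[n]


def quotient_sides(n, p):
    """Both sides of (4.3) reduced mod p^4."""
    modulus = p**4
    lhs = factorial(n * p) // (factorial(n) * factorial(p) ** n) % modulus
    correction = Fraction(n**3 - n, 9) * p**3 * bernoulli_number(p - 3)
    rhs = (1 - correction.numerator
           * pow(correction.denominator, -1, modulus)) % modulus
    return lhs, rhs
```

With $n = 2$, $p = 5$ the quotient is $\binom{9}{4} = 126$, and the right member is $1 - 125/9 \equiv 126 \pmod{625}$.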
We can generalize (4.3) without much trouble. To begin with we replace
(4.1) by

(4.4) $\frac{(np^r)!}{n!\,(p^r!)^n} = \prod_{k=2}^{n}\binom{kp^r-1}{p^r-1}$,


which is easily verified. Secondly, for r > 2, 


Z we - 1). Ty (@ - ne" + ie — 1) /(ie - 1) 
(4.5) , = (* = 0-1 I] ra > — 1): 
But by (4.2) it is clear that (4.5) implies 


(4.6) 0, = O14 (mod p*). 
Thus comparison with (4.4) and (4.3) yields 








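(4.7) $\frac{(np^r)!}{n!\,(p^r!)^n} \equiv 1 - \tfrac19(n^3 - n)p^3B_{p-3} \pmod{p^4}$,

which is valid for all $r \ge 1$.
We remark that for $n \equiv 0, \pm1 \pmod p$, (4.7) becomes

$\frac{(np^r)!}{n!\,(p^r!)^n} \equiv 1 \pmod{p^4}$,

while for $m \equiv n \pmod p$,

$\frac{(mp^r)!}{m!\,(p^r!)^m} \equiv \frac{(np^r)!}{n!\,(p^r!)^n} \pmod{p^4}$.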
5. General sequences. In order to generalize Glaisher’s theorem we take 
(5.1) $f = f(x) = \sum_{m=1}^{\infty}\frac{c_mx^m}{m!} \qquad (c_1 = 1)$,

where the rational numbers $c_m$ are integral (mod $p$). Now put

(5.2) $\frac{x}{f} = \sum_{m=0}^{\infty}\frac{\beta_mx^m}{m!} \qquad (\beta_0 = 1)$,







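or what is the same thing

(5.3) $\sum_{r=1}^{m}\binom{m}{r}c_r\beta_{m-r} = \begin{cases}1 & (m = 1),\\ 0 & (m > 1),\end{cases}$

thus recursively defining the $\beta_m$. Moreover, it is evident from (5.3) that $\beta_m$ is
integral (mod $p$) for $m < p - 1$. On the other hand,

(5.4) $p\beta_{p-1} + c_p \equiv 0 \pmod p$;

a somewhat sharper result is

(5.5) $p\beta_{p-1} - p\sum_{r=2}^{p-1}\frac{(-1)^r}{r}c_r\beta_{p-r} + c_p \equiv 0 \pmod{p^2}$.

In the next place, for $k \ge 1$, define

(5.6) $\left(\frac{x}{f}\right)^k = \sum_{m=0}^{\infty}\frac{\beta_m^{(k)}x^m}{m!} \qquad (\beta_0^{(k)} = 1)$,

so that $\beta_m^{(1)} = \beta_m$. It will also be convenient to define $\delta_m$ by means of

(5.7) $\frac{xf'}{f} = \sum_{m=0}^{\infty}\frac{\delta_mx^m}{m!} \qquad (\delta_0 = 1)$.

By (5.1) and (5.2), (5.7) implies

(5.8) $\delta_m = \sum_{r=0}^{m}\binom{m}{r}c_{r+1}\beta_{m-r}$.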
r=0 r 

Thus $\delta_m$ is integral (mod $p$) for $m < p - 1$, while by (5.4) and (5.8)

(5.9) $p\delta_{p-1} + c_p \equiv 0 \pmod p$.

Indeed (5.8) implies the sharper result

(5.10) $\delta_{p-1} - \beta_{p-1} \equiv \sum_{r=1}^{p-1}(-1)^rc_{r+1}\beta_{p-1-r} \pmod p$.

We remark that for $m = p$, (5.8) implies

(5.11) $\delta_p - \beta_p \equiv c_{p+1} + pc_2\beta_{p-1} - p\sum_{r=2}^{p-1}\frac{(-1)^r}{r}c_{r+1}\beta_{p-r} \pmod{p^2}$;

that $\beta_p$ is integral (mod $p$) is clear from (5.3). In fact (5.3) implies, for $m = p + 1$,


(5.12) By + 4c2 PBp+ + ey + pd fee Bri =0 (mod p’). 


For a generalization of the von Staudt-Clausen theorem for the numbers
$\beta_m$ see [1]; the same result applies to $\delta_m$ also.


6. Generalization of Glaisher’s theorem. Differentiation of (5.6) yields 


x\* (s)'f = mBy x” 













and thus by (5.7) we get 





(k _ *)6n = m oo BY x™ Cs 5. x" 

kD m! » _ 
This identity is equivalent to

(6.1) $m\beta_m^{(k)} + k\sum_{r=1}^{m}\binom{m}{r}\delta_r\beta_{m-r}^{(k)} = 0$.

We take $k = p$ in (6.1) and suppose $m < p$. It follows at once that

(6.2) $\beta_m^{(p)} \equiv 0 \pmod p$, $\quad 1 \le m < p - 1$,

while for $m = p - 1$,

(6.3) $(p-1)\beta_{p-1}^{(p)} \equiv -p\delta_{p-1} \equiv c_p \pmod p$.

We shall now sharpen (6.2) and (6.3).
In the first place (6.1) becomes, for $m = p - 1$,

(6.4) $(p-1)\beta_{p-1}^{(p)} + p\delta_{p-1} = -p\sum_{r=1}^{p-2}\binom{p-1}{r}\delta_r\beta_{p-1-r}^{(p)}$.

For $1 \le m < p - 1$, (6.1) implies, using (6.2),

(6.5) $m\beta_m^{(p)} + p\delta_m \equiv 0 \pmod{p^2}$.
If we substitute from (6.5) in the right member of (6.4), we get 


(6.6) (> — 1) BP + pip = — °F g, by-1-+ 
=p ‘Ft <5, by, (mod p’). 
Similarly if m < p — 1, (6.1) yields 
(6.7) m + pin =p 3 4 a8 PePm . (mod p*). 


If we substitute from (6.7) in (6.1) we get even stronger (but rather complica- 
ted) results. For example (6.4) becomes 


(p _ 1) pe’, + Pby_1 = p* >> (?- on ') 6, bp-1_, 


r=1 r 
(6.4)’ a ay 
ail )» (?- ‘) Pater PBs (mod p* ). 


r=l 


We remark that 
a = — 5, (mod p’). 


7. Special cases. It is of interest to see what some of the above formulae
reduce to when $c_m = 1$ for all $m \ge 1$ in (5.1). Then in the first place $\beta_m = B_m$,













the $m$th Bernoulli number in Nörlund's notation. In the second place, by (5.7),

(7.1) $\delta_1 = \tfrac12$, $\quad \delta_m = B_m \quad (m > 1)$.

In particular $B_{2m+1} = \delta_{2m+1} = 0$ for $m \ge 1$. It is also clear from (5.6) that

$\beta_m^{(k)} = B_m^{(k)}$.

In the next place (6.1) reduces to

$mB_m^{(k)} + k\sum_{r=1}^{m}(-1)^r\binom{m}{r}B_rB_{m-r}^{(k)} = 0$,

which is identical with [8, p. 146 (83)]. Now in view of

$(-1)^m\binom{p-1}{m}B_m^{(p)} = A_m$,

we see that (6.2) and (6.3) become

$A_m \equiv 0 \quad (1 \le m < p - 1)$, $\qquad A_{p-1} \equiv -1 \pmod p$.
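The relation between $A_m$ and the Bernoulli numbers of order $p$ can be illustrated by expanding $(x/(e^x - 1))^p$ as a truncated power series with exact rational arithmetic. The sketch below uses hypothetical function names and takes $f(x) = e^x - 1$, the special case $c_m = 1$.

```python
from fractions import Fraction
from math import comb, factorial


def higher_order_bernoulli(k, n):
    """Return [B_0^{(k)}, ..., B_{n-1}^{(k)}] from (x/(e^x - 1))^k."""
    g = [Fraction(1, factorial(j + 1)) for j in range(n)]  # (e^x - 1)/x
    inv = [Fraction(1)]                                     # x/(e^x - 1)
    for m in range(1, n):
        inv.append(-sum(g[j] * inv[m - j] for j in range(1, m + 1)))
    power = [Fraction(1)] + [Fraction(0)] * (n - 1)
    for _ in range(k):
        # truncated Cauchy product with one more factor x/(e^x - 1)
        power = [sum(power[i] * inv[m - i] for i in range(m + 1))
                 for m in range(n)]
    return [power[m] * factorial(m) for m in range(n)]


def coefficients_A(p):
    """A_m = (-1)^m C(p-1, m) B_m^{(p)} for m = 0, ..., p-1."""
    B = higher_order_bernoulli(p, p)
    return [(-1) ** m * comb(p - 1, m) * B[m] for m in range(p)]
```

For $p = 5$ this reproduces the coefficients of $(x-1)(x-2)(x-3)(x-4) = x^4 - 10x^3 + 35x^2 - 50x + 24$, and the residues $A_m \equiv 0$, $A_{p-1} \equiv -1 \pmod p$ can be read off directly.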
Next (6.5) implies for $m$ odd, $1 < m < p$, $A_m \equiv 0 \pmod{p^2}$, while (6.4) yields

$(p-1)A_{p-1} + pB_{p-1} \equiv 0 \pmod{p^2}$,

another theorem due to Glaisher [4, p. 325]. We have also from (6.5) for $m$
even, $2 \le m < p - 1$,

$\frac1pA_m \equiv -\frac1mB_m \pmod p$,

which is the same as (1.1). As for (6.6), it evidently implies

(7.2) $(p-1)A_{p-1} + pB_{p-1} \equiv p^2\sum_{r=1}^{\frac12(p-3)}\frac{1}{2r}B_{2r}B_{p-1-2r} \pmod{p^3}$,

which is equivalent to a result of Nielsen already referred to (see (2.13) above).
Finally (6.7) yields for $m$ odd, $3 \le m < p$,

$\frac{1}{p^2}A_m \equiv \frac{m}{2(m-1)}B_{m-1} \pmod p$,

which is the same as (1.2). For $m$ even, $2 \le m \le p - 1$, we get
(7.3)  ¢ Am + B.) == (")5 B d p) 
d % mA » PBm) = 2+ Or \2r 2r Dm—2r mod p), 


which seems to be new. For m = p — 1, (7.3) coincides with (7.2). 


8. The case k negative. In (5.6) we assumed k > 1. However the definition 
is valid for negative k also and it is of some interest to consider an application 
for such k. If then we take k = — p, (6.1) implies 


(8.1) mp.” = p>, oy, 5, Bn . 


rl 













Thus corresponding to (6.2) and (6.3) we get 


(8.2) Cc? = 0 (mod p), 1 <m<p-— 1, 
(8.3) (p — 1) A? = 95,1 = —Cc, (mod ?). 
In the next place we have 
p—2 _ 
(8.4) (p — 1) 8-7 — pi. = pd, (? - ‘) 5, Bt, 
and 
~ (8.5) mp” — pin = 0 (mod p*), 1 <m<p—1. 
Substitution in (8.4) yields 
(8.6) (> — 1) Be? — pbs = PD et ee (mod p*); 
similarly, for m < p — 1, 
m—1 
(8.7) mB” — pin = p° p> : " 5, Snr (mod p’*). 


Comparison with ae and (6.8) ih 
(8.8) ; (mbm + Pim) = ; (mBm”’ — Pom) (mod p). 
If we now specialize as in §7, and recall that 


e —1 k\ Sr x” 
Ko) - 2 OES 


we see that 


al—*) ND se tie kr *) mtk m! m+k 
™ ~ a tmk(- N (: —— 
so that BC” /m! is a Stirling number of the second kind. We now have at once 
(8.5’) BS? = 0 (mod p*),1<2r7+1<p-—1 
+ BS” = = Bs, (mod p), 1 <2r<p—1, 
}(p—3) 1 
(8.6) (p—1) BS? — pB,,1=p> >> = Bo, By-1-2, (mod p’) 
rl = 
(8.7’) 3 BS?, = ot Bs, (mod p),1<9+1<p-1. 


Formulae (8.5’) and (8.7’) are due to Nielsen [7, p. 338]. 


9. Generalized Euler numbers. We now briefly consider sequences related 
to the Euler numbers of higher order. Let a be a fixed rational number which is 
integral (mod p) and put 







2 (*%) om , 7 m 
9.1 a. fm X _ af _ tmx 
@.1) (+ af) 2», mi’ 1l+af a= m! 
where $f = f(x)$ has the same meaning as in (5.1). The coefficients $\eta_m^{(k)}$ and $\zeta_m$
are evidently integral (mod $p$).
If we differentiate the first of (9.1), we get 


(9.2) nes - = &S (™) io 


s=0 


which is analogous to (6.1). In particular for k = p, (9.2) implies 


(9.3) 5a -f. = =- > () Sm—s a?” ’ 
so that 
(9.4) 5m =— Ct, (mod p). 
Substitution of (9.4) in (9.3) now yields 
(bst.-1) «8 () 
(9.5) p ( Nm+1 tn) = p> P Sm—s 5-1 (mod p). 


Now for a = 4 we have [8, p. 143] 


2 k ~ oy x" 
(2.) 2 x 2” m! ’ 


so that 7 = 2-"C,. Also [, = — 2-""'C,, for m > 0, where C,, = Co; 
we recall that C., = 0 for r > 0. We can therefore state the following results as 
special cases of (9.4) and (9.5): 





(9.6) 5c? = Cor, ; +1 = 0 (mod p); 
—! (9, .. 

(9.7) : 1 cw _ Cop-1 = p wail . ' Cor—2» 1 Cop—1 (mod P), 
p p s=1 2s 

(9.8) rs Cha. = — (27 + 1) Coys (mod p). 


These congruences are evidently analogous to Glaisher’s theorem for A2,, A241. 
Finally if we take k in (9.1) negative we get results similar to those above. 
In particular for k = — p, we have 


, l —p) = m . { 
(9.3 ) Seti _ Cm = >» §Sm—s %s 


p r= l 


(9.4’) "nai = tm (mod p), 


(9.5") : (3 tnt + in) => (”") Sa~atie~s (mod p). 
s=l 










Comparison with (9.5) gives

1/1 fer
9 “= ( + = .) =~ (2 tt = -)
(9.9) > \p™ t > \p” ¢ (mod p).

Then if a = ½ we get the special formulae

(9.6’) 5c” m= = Cons (mod p),

1 " - (2r-1
(9.7’) i(2 Ge = Cart) = > ( = ) Car-ae-1 Coys (mod p),
(9.8’) 5 Cth = - (2r + 1) Co (mod p).

Formulae (9.6’) and (9.8’) are proved by Nielsen [7, p. 292]; to facilitate
comparison we note that

ff * e" «a > (*) $”.

=o \S


REFERENCES

1. L. Carlitz, The coefficients of the reciprocal of a series, Duke Math. J., 8 (1941), 689-700.
2. L. E. Dickson, History of the theory of numbers, vol. 1 (Washington, 1919).
3. J. W. L. Glaisher, Congruences relating to the sums of products of the first n numbers and to other sums of products, Quarterly J. Math., 31 (1900), 1-35.
4. J. W. L. Glaisher, On the residues of the sums of products of the first p − 1 numbers, and their powers, to modulus p² or p³, Quarterly J. Math., 31 (1900), 321-353.
5. N. Nielsen, Om Potenssummer af hele Tal, Nyt Tidsskrift for Mathematik, 4B (1893), 1-10.
6. N. Nielsen, Recherches sur les suites régulières et les nombres de Bernoulli et d'Euler, Annali di matematica (3), 22 (1914), 71-115.
7. N. Nielsen, Traité élémentaire des nombres de Bernoulli (Paris, 1923).
8. N. E. Nörlund, Vorlesungen über Differenzenrechnung (Berlin, 1924).


Duke University 





WEIGHTED QUADRATIC PARTITIONS 
OVER A FINITE FIELD 


LEONARD CARLITZ 


Introduction. Using some known results on Gauss sums in a finite field, 
it is shown that the sum (1.3) defined below can either be evaluated explicitly 
or expressed in terms of a Kloosterman sum. The same result applies to the 
more general sum S(a, A, Q) defined in (5.1). The latter sum also satisfies the 
reciprocity formula (5.5). Some related sums are discussed in §§6, 7. 


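1. The weighted sum S. Let $q = p^n$, where $p$ is an odd prime. Assuming
$\alpha \in GF(q)$, we put

(1.1) $e(\alpha) = e^{2\pi i\,t(\alpha)/p}$,

where

$t(\alpha) = \alpha + \alpha^p + \cdots + \alpha^{p^{n-1}}$.

Then as is well known

(1.2) $\sum_{\beta} e(\alpha\beta) = \begin{cases} q & (\alpha = 0),\\ 0 & (\alpha \ne 0).\end{cases}$

By $\sum_\alpha$, $\sum_\beta$, etc. will be understood summations over the numbers of $GF(q)$.

Let $\alpha_1, \ldots, \alpha_s$ be non-zero numbers of $GF(q)$ and consider the sum

(1.3) $S = \sum_{\alpha_1\xi_1^2 + \cdots + \alpha_s\xi_s^2 = \alpha} e(2\lambda_1\xi_1 + \cdots + 2\lambda_s\xi_s)$,

where $\alpha$, $\lambda_i$ are arbitrary and the summation is extended over all sets $\xi_1, \ldots, \xi_s$
satisfying $\alpha_1\xi_1^2 + \cdots + \alpha_s\xi_s^2 = \alpha$. Using (1.2) we may write

(1.4) $qS = \sum_{\xi_1,\ldots,\xi_s}\sum_{\beta} e\{\beta(\alpha_1\xi_1^2 + \cdots + \alpha_s\xi_s^2 - \alpha) + 2\lambda_1\xi_1 + \cdots + 2\lambda_s\xi_s\} = \sum_{\beta} e(-\alpha\beta)\prod_{i=1}^{s}\sum_{\xi} e(\alpha_i\beta\xi^2 + 2\lambda_i\xi)$.

Now for $\beta \ne 0$ we have

(1.5) $\sum_{\xi} e(\beta\xi^2 + 2\lambda\xi) = \sum_{\xi} e\!\left(\beta\left(\xi + \frac{\lambda}{\beta}\right)^2 - \frac{\lambda^2}{\beta}\right) = e\!\left(-\frac{\lambda^2}{\beta}\right)G(\beta)$,

where

(1.6) $G(\beta) = \sum_{\xi} e(\beta\xi^2) \qquad (\beta \ne 0)$.

It is known that [2, §3]

(1.7) $G(\beta) = \psi(\beta)G(1)$, $\qquad G^2(1) = q\psi(-1)$,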

Received March 17, 1952 


where $\psi(\beta) = +1$ or $-1$ according as $\beta$ is a square or a non-square in $GF(q)$.
We have also

(1.8) $\sum_{\beta}\psi(\beta)e(\alpha\beta) = \psi(\alpha)G(1)$

for all $\alpha$, provided we define $\psi(0) = 0$.
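Making use of (1.5) and (1.6), we see that (1.4) becomes

(1.9) $qS = \sum_{\xi_1,\ldots,\xi_s} e(2\lambda_1\xi_1 + \cdots + 2\lambda_s\xi_s) + \sum_{\beta\ne0} e\!\left(-\alpha\beta - \frac{\omega}{\beta}\right)G(\alpha_1\beta)\cdots G(\alpha_s\beta)$,

where for brevity we put

(1.10) $\omega = \frac{\lambda_1^2}{\alpha_1} + \cdots + \frac{\lambda_s^2}{\alpha_s}$.

The first sum in the right member of (1.9) vanishes unless all $\lambda_i = 0$. Next
using (1.7), the second sum in (1.9) becomes

(1.11) $\sum_{\beta\ne0} e\!\left(-\alpha\beta - \frac{\omega}{\beta}\right)\psi(\alpha_1\cdots\alpha_s\beta^s)\,G^s(1)$.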

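2. Kloosterman sums. To evaluate (1.11) we consider separately $s = 2t$,
$s = 2t + 1$. For $s = 2t$, (1.11) becomes

(2.1) $q^t\psi((-1)^t\alpha_1\cdots\alpha_s)\sum_{\beta\ne0} e\!\left(-\alpha\beta - \frac{\omega}{\beta}\right)$;

for $s = 2t + 1$, we get

(2.2) $q^tG(1)\psi((-1)^t\alpha_1\cdots\alpha_s)\sum_{\beta\ne0}\psi(\beta)e\!\left(-\alpha\beta - \frac{\omega}{\beta}\right)$.

For $\alpha = 0$, (2.1) and (2.2) are easily evaluated. For $\alpha \ne 0$, we define the
Kloosterman sums

(2.3) $K(\alpha, \omega) = \sum_{\beta\ne0} e\!\left(\alpha\beta + \frac{\omega}{\beta}\right)$, $\qquad K(\alpha) = K(\alpha, 1)$,
(2.4) $L(\alpha, \omega) = \sum_{\beta\ne0}\psi(\beta)e\!\left(\alpha\beta + \frac{\omega}{\beta}\right)$, $\qquad L(\alpha) = L(\alpha, 1)$.

If $\omega = 0$ we have at once

(2.5) $K(\alpha, 0) = -1$, $\qquad L(\alpha, 0) = G(\alpha)$.

We note also that for $\gamma \ne 0$,

(2.6) $K(\alpha\gamma, \omega\gamma^{-1}) = K(\alpha, \omega)$, $\qquad L(\alpha\gamma, \omega\gamma^{-1}) = \psi(\gamma)L(\alpha, \omega)$.

In particular for $\omega \ne 0$,

(2.7) $K(\alpha, \omega) = K(\alpha\omega)$, $\qquad L(\alpha, \omega) = \psi(\omega)L(\alpha\omega)$.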













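We can easily evaluate $L(\alpha)$ (compare [4, p. 102]). (For $\alpha = 0$ it is evident
from (1.8) that $L(0) = G(1)$.) For $\alpha \ne 0$ we have, using (1.5) and (1.7),

$\sum_{\xi} e(-\beta\xi^2 + 2\xi) = G(-\beta)e\!\left(\frac1\beta\right) = G(-1)\psi(\beta)e\!\left(\frac1\beta\right)$,

$G(-1)\sum_{\beta\ne0}\psi(\beta)e\!\left(\alpha\beta + \frac1\beta\right) = \sum_{\xi}\sum_{\beta\ne0} e(\beta(\alpha - \xi^2) + 2\xi)$.

Summing on the right side first with respect to $\beta$, we get

(2.8) $L(\alpha) = 0 \quad (\psi(\alpha) = -1)$, $\qquad L(\gamma^2) = G(1)(e(2\gamma) + e(-2\gamma))$.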
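Formula (2.8) can be tested numerically in the prime field $GF(p)$, where $t(\alpha) = \alpha$ and $e(\alpha) = e^{2\pi i\alpha/p}$. The following is a floating-point sketch with hypothetical function names, restricted to $n = 1$.

```python
import cmath


def e(a, p):
    """Additive character e(a) = exp(2*pi*i*a/p) of GF(p)."""
    return cmath.exp(2j * cmath.pi * (a % p) / p)


def psi(a, p):
    """Quadratic character of GF(p), with psi(0) = 0."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def gauss_sum(b, p):
    """G(b) = sum over x of e(b*x^2)."""
    return sum(e(b * x * x, p) for x in range(p))


def salie_sum(a, p):
    """L(a) = sum over b != 0 of psi(b) * e(a*b + 1/b)."""
    return sum(psi(b, p) * e(a * b + pow(b, -1, p), p) for b in range(1, p))


def check_2_8(p, tol=1e-9):
    """Compare L(a) with the closed form (2.8) for every a != 0."""
    g1 = gauss_sum(1, p)
    for a in range(1, p):
        if psi(a, p) == -1:
            expected = 0
        else:
            gamma = next(x for x in range(1, p) if x * x % p == a)
            expected = g1 * (e(2 * gamma, p) + e(-2 * gamma, p))
        if abs(salie_sum(a, p) - expected) > tol:
            return False
    return True
```

The same helpers also confirm $G^2(1) = p\,\psi(-1)$ from (1.7) up to rounding error.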


3. Evaluation of S. We now collect these results. There are several cases.
Using (1.9), (2.1), (2.2) we have first, for $\alpha = \omega = 0$,

(3.1) $S = \begin{cases} q^{s-1}l + q^{t-1}(q-1)\psi((-1)^t\delta) & (s = 2t),\\ q^{s-1}l & (s = 2t+1),\end{cases}$

where $\delta = \alpha_1\cdots\alpha_s$ and $l = 1$ if all $\lambda_i = 0$, $l = 0$ otherwise. Next if $\alpha \ne 0$,
$\omega = 0$, the sum in (2.1) reduces to $-1$, while the sum in (2.2) $= G(-\alpha)$.
Hence

(3.2) $S = \begin{cases} q^{s-1}l - q^{t-1}\psi((-1)^t\delta) & (s = 2t),\\ q^{s-1}l + q^t\psi((-1)^t\alpha\delta) & (s = 2t+1).\end{cases}$

If $\alpha = 0$, $\omega \ne 0$, we get

(3.3) $S = \begin{cases} q^{s-1}l - q^{t-1}\psi((-1)^t\delta) & (s = 2t),\\ q^{s-1}l + q^t\psi((-1)^t\omega\delta) & (s = 2t+1).\end{cases}$

For $\alpha \ne 0$, $\omega \ne 0$, we take first $s = 2t + 1$. The sum in (2.2) is evaluated by
means of (2.4), (2.6), and (2.8). We find that

(3.4) $S = \begin{cases} q^{s-1}l + q^t\psi((-1)^t\omega\delta)(e(2\gamma) + e(-2\gamma)) & (\alpha\omega = \gamma^2),\\ q^{s-1}l & (\psi(\alpha\omega) = -1).\end{cases}$

On the other hand, for $s = 2t$ we get

(3.5) $S = q^{s-1}l + q^{t-1}\psi((-1)^t\delta)K(\alpha, \omega)$.

Thus the sum $S$ defined in (1.3) has been evaluated explicitly except in the
case $\alpha\omega \ne 0$, $s = 2t$; according to (3.5), the value of $S$ depends on the Kloosterman
sum $K(\alpha, \omega)$. We remark that if $\lambda_1 = \cdots = \lambda_s = 0$ (so that $\omega = 0$) then
(3.1) and (3.2) reduce to the well-known results [3, pp. 47-48] for the number of
solutions of the equation $\alpha_1\xi_1^2 + \cdots + \alpha_s\xi_s^2 = \alpha$.
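The remaining case (3.5) can be compared against a brute-force evaluation of (1.3). The sketch below is an illustration only: it assumes $q = p$ (so $n = 1$), takes $s = 2$ (so $t = 1$), and uses hypothetical function names.

```python
import cmath


def e(a, p):
    return cmath.exp(2j * cmath.pi * (a % p) / p)


def psi(a, p):
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def kloosterman(a, w, p):
    """K(a, w) = sum over b != 0 of e(a*b + w/b)."""
    return sum(e(a * b + w * pow(b, -1, p), p) for b in range(1, p))


def S_direct(a1, a2, lam1, lam2, alpha, p):
    """The sum (1.3) for s = 2, by enumerating all solutions."""
    return sum(e(2 * (lam1 * x + lam2 * y), p)
               for x in range(p) for y in range(p)
               if (a1 * x * x + a2 * y * y - alpha) % p == 0)


def S_closed(a1, a2, lam1, lam2, alpha, p):
    """Right member of (3.5) with s = 2, t = 1, delta = a1*a2."""
    omega = (lam1**2 * pow(a1, -1, p) + lam2**2 * pow(a2, -1, p)) % p
    l = 1 if lam1 % p == 0 and lam2 % p == 0 else 0
    return p * l + psi(-a1 * a2, p) * kloosterman(alpha, omega, p)


def max_gap(p, a1, a2):
    """Largest |S_direct - S_closed| over all parameters with alpha*omega != 0."""
    gap = 0.0
    for lam1 in range(p):
        for lam2 in range(p):
            omega = (lam1**2 * pow(a1, -1, p) + lam2**2 * pow(a2, -1, p)) % p
            if omega == 0:
                continue
            for alpha in range(1, p):
                diff = S_direct(a1, a2, lam1, lam2, alpha, p) \
                    - S_closed(a1, a2, lam1, lam2, alpha, p)
                gap = max(gap, abs(diff))
    return gap
```

For $p = 3$, $\alpha_1 = \alpha_2 = 1$, $\lambda = (1, 0)$, $\alpha = 1$, both evaluations give $S = 1$.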


4. Bounds for S. In view of (3.5) it is of some interest to find an estimate
for $S$ that will give some information in that case. If we put

$T(\alpha, \beta) = \sum_{\xi} e(\alpha\xi^2 + 2\beta\xi) \qquad (\alpha \ne 0)$,

we have

$|T(\alpha, \beta)|^2 = \sum_{\xi,\eta} e(\alpha(\xi+\eta)^2 + 2\beta(\xi+\eta))\,e(-\alpha\eta^2 - 2\beta\eta) = \sum_{\xi} e(\alpha\xi^2 + 2\beta\xi)\sum_{\eta} e(2\alpha\xi\eta)$.

By (1.2), the inner sum vanishes unless $\xi = 0$. Hence

(4.1) $|T(\alpha, \beta)|^2 = q$.

Returning to (1.4) we have

$qS = q^sl + \sum_{\beta\ne0} e(-\alpha\beta)\prod_{i=1}^{s} T(\alpha_i\beta, \lambda_i)$.

Applying (4.1), this becomes

(4.2) $|S - q^{s-1}l| \le (q-1)q^{\frac12s-1}$.

The estimate (4.2) has been obtained without using any property of the sum
$K(\alpha, \omega)$. If the trivial estimate $|K(\alpha, \omega)| \le q - 1$ is used in (3.5), we again
get (4.2).

Now it can be shown by elementary methods (compare [4, p. 106]) that

(4.3) $|K(\alpha, \omega)| \le 2q^{3/4}$.

Substituting from (4.3) in (3.5) we find

(4.4) $|S - q^{s-1}l| \le 2q^{t-\frac14} \qquad (s = 2t)$,

which is somewhat sharper than (4.2). If in place of (4.3) we use Weil's result
[5, p. 207]

$|K(\alpha, \omega)| \le 2q^{1/2}$,

then (4.4) becomes

(4.5) $|S - q^{s-1}l| \le 2q^{t-\frac12} \qquad (s = 2t)$.
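Both ingredients of these estimates, the identity (4.1) and Weil's bound for the Kloosterman sum, can be observed numerically over a prime field. This is a floating-point sketch with hypothetical names, assuming $q = p$.

```python
import cmath
import math


def e(a, p):
    return cmath.exp(2j * cmath.pi * (a % p) / p)


def T(a, b, p):
    """T(a, b) = sum over x of e(a*x^2 + 2*b*x), for a != 0."""
    return sum(e(a * x * x + 2 * b * x, p) for x in range(p))


def kloosterman(a, w, p):
    return sum(e(a * b + w * pow(b, -1, p), p) for b in range(1, p))


def check_bounds(p, tol=1e-9):
    """Return (|T|^2 = p for all a != 0, |K| <= 2*sqrt(p) for all a*w != 0)."""
    t_ok = all(abs(abs(T(a, b, p)) ** 2 - p) < tol
               for a in range(1, p) for b in range(p))
    k_ok = all(abs(kloosterman(a, w, p)) <= 2 * math.sqrt(p) + tol
               for a in range(1, p) for w in range(1, p))
    return t_ok, k_ok
```

Calling `check_bounds(p)` for a few small odd primes gives a quick plausibility check of (4.1) and of the bound used for (4.5).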


5. Generalization and reciprocity formula. The results of §3 can be stated
in more general terms if in place of (1.3) we consider the sum

(5.1) $S = S(\alpha, \lambda, Q) = \sum e(2\lambda_1\xi_1 + \cdots + 2\lambda_s\xi_s)$,

where $Q$ denotes a quadratic form

$Q(u) = \sum_{i,j=1}^{s}\alpha_{ij}u_iu_j \qquad (\alpha_{ij} \in GF(q),\ \delta = |\alpha_{ij}| \ne 0)$,

and the summation in (5.1) is over all $\xi_i$ such that $Q(\xi_1, \ldots, \xi_s) = \alpha$. Since a
quadratic form with coefficients in $GF(q)$ can be reduced to diagonal form by a
linear transformation, it follows that the sum (5.1) can be reduced to the form
(1.3). Thus the $\lambda$'s undergo a linear transformation; however, the number $l$
occurring in the formulae of §§3, 4 will have the same meaning as before,
namely, $l = 1$ if all $\lambda$'s vanish, $l = 0$ otherwise.


To compute the number $\omega$ in the general case we recall [1, p. 140, Theorem 2]
that a quadratic form 


s+1 


j= Do ater, 
1 
in s + | variables can be transformed into 


s 
A 
a? es 2 
de aigins + 5 X s+1 (x41 = Xo41), 
where A is the discriminant of f and 4 is the co-factor of a,4;, .4:. Applying this 
result to 


Q(é:, ae | E,) + 2 (Ane: + oe + A,£;), 
we evidently get 
A 


QO(éi,...,&) + 5 


where it is clear that 6 is the discriminant of Q and 


11 Ais Ai 
Awlccttt? 
Ost = | 
| Ay sMe OI 
Consequently

(5.2) $\omega = Q'(\lambda_1, \ldots, \lambda_s)$,

where $Q'(u)$ denotes the quadratic form inverse to $Q(u)$. The results of §3 have
been written in terms of $\delta$, the discriminant in the quadratic case. We see that
all the results of §§3, 4 can now be carried over to the general case and need not
be restated.

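The following remark may be of interest. Let $\lambda_1, \ldots, \lambda_s$ be assigned and
define $\lambda_1', \ldots, \lambda_s'$ by means of

(5.3) $\lambda_i = \sum_{j=1}^{s}\alpha_{ij}\lambda_j'$.

By a well-known theorem, the linear transformation (5.3) carries $Q$ into $Q'$,
that is,

(5.4) $Q(\lambda') = Q'(\lambda)$.

Now we have also

$qS(\alpha, \lambda, Q) = q^sl + \sum_{\beta\ne0} e\!\left(-\alpha\beta - \frac{Q'(\lambda)}{\beta}\right)\psi(\delta\beta^s)\,G^s(1)$,

and for the inverse form

$qS(\alpha, \lambda', Q') = q^sl + \sum_{\beta\ne0} e\!\left(-\alpha\beta - \frac{Q(\lambda')}{\beta}\right)\psi(\delta^{-1}\beta^s)\,G^s(1)$.

Therefore, by (5.4) we have the following reciprocity formula:

(5.5) $S(\alpha, \lambda, Q) = S(\alpha, \lambda', Q')$.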
6. The sum S;. A word may be added about the sum 


(6.1) S; = > e(Ast’ +... + AaE,). 
a, £,+...+a,f,—e 
Clearly we have 
QSi= >> Dd e(2Blarti +... + ast, — awde(Arts” +... + Asks”) 
E,,..-.€& B 
(6.2) 8 
= > e(— 2ag)[] > e(rdt® + 2a). 
8 iml ¢ 


For simplicity we assume that no A; = 0. Then 


Beles) (-€) 


d e(aAd® + 28a) 


G(A,)e(— Bay A). 


Substitution in (6.2) now leads to 


(6.3) gSi = ¥(A)G"(1) 2) e(— 2a — 6x), 
where 
(6.4) ee post... +>. 
If « = 0, (6.3) becomes 
fs — jo (a 
— = Va WaGe'(1) (a 


For » + 0, 


2 
>> e(— 2aB — Bu) = e(a’*/u) >> A - AC +%) ), 
8 8 


and therefore, 





(6.6) Si = q'¥(— w)G*"(1)e(a*/n) (u # 0). 


By means of (6.5) and (6.6), S; is determined in all cases. We remark that an 
explicit formula for G(1) is available, since it is expressible in terms of an 


ordinary Gauss sum. 




7. Another weighted sum. Finally we consider the following sum which is 
closely related to some of the results obtained above: 


(7.1) S= ¥ wh)... ve), 


where the summation is over all —; # 0 such that 


(7.2) aut: + Bit: +... +,¢, + 6, =a (a, ~ 0, B, ¥ 0). 
We have 


qs » e(— a8) 2 EO> (ag, + pac")) ve) ..» W(E,) 


II 


d,¢(- ab) [] L(Ga., BB,), 


where L(a, 8) is defined by (2.4). In view of (2.8) we have at once 


(7.3) S§=0 (for all a) 
if ¥(a,B,) = — 1 for at least one value of 7. On the other hand, if a8, = y/, 
i=1,..., 7, then we get, using the second of (2.8), 


(7.4) S=q'G'(1)¥(6... Br) 2) e(— as)v"(8)T] (e(2By,) + e(— 2By,)). 


Thus the sum (7.1) is evaluated in all cases by (7.3) and (7.4). 
Note that if 1 < s <r, and 


S = Do wb)... WE) 


the summation extending over all ¢; ¥ 0 satisfying (7.2), then S’ = 0 provided 
¥(a,8,) = — 1 for at least one value of i < s. 


REFERENCES

1. M. Bôcher, Higher algebra (New York, 1924).
2. L. Carlitz, The singular series for sums of squares of polynomials, Duke Math. J., 14 (1947), 1105-1120.
3. L. E. Dickson, Linear groups (Leipzig, 1901).
4. H. Salié, Über die Kloostermanschen Summen S(u, v; q), Math. Z., 34 (1932), 91-109.
5. André Weil, On some exponential sums, Proc. Nat. Acad. Sci., 34 (1948), 204-207.


Duke University








AUTOMETRIZATION AND THE SYMMETRIC 
DIFFERENCE 


J. G. ELLIOTT 


1. Introduction. The fact that the symmetric difference (i.e., ab′ + a′b) is a
group operation in a Boolean algebra is, of course, well known. Not so well known
is the fact observed by Ellis [3] that it possesses some of the desirable properties
of a metric distance function. Specifically, if * denotes this operation, it is
easy to verify that

M1: a*b = 0 if and only if a = b,

where 0 is the first element of the Boolean algebra,

M2: a*b = b*a,
M3: (a*b) + (b*c) ≥ (a*c),

where ≥ denotes inclusion in the wide sense. In this note a + b and ab denote
respectively the join and meet of a and b. Any binary operation satisfying M1,
M2, and M3 might be referred to appropriately as an autometric operation, or
simply as a metric operation. It might be observed that, in the language of Ellis
[4], these properties make the Boolean algebra into a generalized metric ground
space. The symmetric difference is at once a group and a metric operation. Our
first objective in this note is to prove that the symmetric difference is the only
such operation. We then examine other possible characterizations of the symmetric
difference arising from weakening or changing these hypotheses.
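Properties M1, M2, and M3 can be verified mechanically in a finite Boolean algebra. In the sketch below (an illustration, not part of the original argument) the elements of the algebra of subsets of an n-element set are modelled as n-bit integers, with join as `|`, meet as `&`, and the symmetric difference as `^`; the function name is hypothetical.

```python
from itertools import product


def is_metric(op, n_bits):
    """Check M1, M2, M3 for a binary operation on n-bit masks."""
    elems = range(1 << n_bits)
    m1 = all((op(a, b) == 0) == (a == b)
             for a, b in product(elems, repeat=2))
    m2 = all(op(a, b) == op(b, a)
             for a, b in product(elems, repeat=2))
    # M3: (a*b) + (b*c) includes a*c, i.e. the meet with a*c is a*c itself
    m3 = all(((op(a, b) | op(b, c)) & op(a, c)) == op(a, c)
             for a, b, c in product(elems, repeat=3))
    return m1 and m2 and m3
```

For instance, `is_metric(lambda a, b: a ^ b, 3)` holds, while the join `lambda a, b: a | b` already fails M1.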

By way of historical summary we observe that Bernstein [1; 2] characterized 
the possible group operations in a Boolean algebra among the class of Boolean 
operations and Frink [5] characterized the symmetric difference, again among 
the class of Boolean operations, as the only group operation over which the set 
product distributes. More recently Helson [7] and Marczewski [8] have charac- 
terized the symmetric difference as the only group operation satisfying certain 
other side conditions. 


2. Metric operations. In this section we designate by * a binary operation 
which is simultaneously a group operation and a metric operation in a Boolean 
algebra, and proceed to identify the operation with that of the symmetric 
difference. 


THEOREM 2.1. The only metric group operation in a Boolean algebra is the 
symmetric difference. 


Received April 6, 1952. This paper is part of a doctoral study. The author is indebted to 
Professor L. M. Kelly for his helpful suggestions and guidance in the preparation of this paper. 




Proof. If x, y, and z are the sides of a triangle in the Boolean algebra, then
x + y = x + z = y + z. For x + y ≥ z and x + z ≥ y by M3, and upon adding
x to each side of each expression, we find x + y ≥ x + z and x + z ≥ x + y.
This implies that x + y = x + z. The proof is similar for the other cases.

Suppose now that a = b*c. From M1 and the associativity of *, it follows
that 0 = a*(b*c) = (a*b)*c, and hence a*b = c. Thus if a*b = c, then a*c = b
and b*c = a. It follows immediately that 0*a = a, for if we assume that 0*a = b,
then a*b = 0 by the previous statement. Thus a = b by M1.

We now show that a*I = a′, where a′ denotes the complement of a. Let
a*a′ = b, and consider the triangle 0, a, a′. Now a + b = a + a′ = I, and
a′ + b = a + a′ = I, so I = (a + b)(a′ + b) = b. Thus a*a′ = I, and
a*I = a′ follows immediately.

Let x*y = p. We will show that p = xy′ + x′y. From the quadrilateral
0, I, x, y we see that x′ + y′ = x′ + p and x′ + y′ = y′ + p. Hence xy′ = xp
and x′y = yp. We then have xy′ + x′y = (x + y)p = p, since x + y ≥ p by
M3. This proves the theorem.

Noting that no use was made of the identity and inverse postulates of a group, 
we immediately have 


THEOREM 2.2. The only metric semi-group operation in a Boolean algebra is 
the symmetric difference. 
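For the Boolean algebra of four elements the uniqueness statement can be confirmed by exhaustive search. The sketch below encodes 0, a, a′, I as the bit masks 0, 1, 2, 3 (an encoding chosen here for illustration), enumerates every symmetric table compatible with M1, and keeps those that also satisfy M3 and associativity.

```python
from itertools import combinations, product

ELEMS = range(4)                       # 0b00 = 0, 0b01 = a, 0b10 = a', 0b11 = I
PAIRS = list(combinations(ELEMS, 2))   # unordered pairs of distinct elements


def metric_semigroup_tables():
    """All tables on the four-element algebra satisfying M1-M3 and associativity."""
    found = []
    # M1 forces a zero diagonal and non-zero entries elsewhere; M2 forces symmetry
    for values in product(range(1, 4), repeat=len(PAIRS)):
        t = [[0] * 4 for _ in ELEMS]
        for (a, b), v in zip(PAIRS, values):
            t[a][b] = t[b][a] = v
        m3 = all(((t[a][b] | t[b][c]) & t[a][c]) == t[a][c]
                 for a, b, c in product(ELEMS, repeat=3))
        assoc = all(t[a][t[b][c]] == t[t[a][b]][c]
                    for a, b, c in product(ELEMS, repeat=3))
        if m3 and assoc:
            found.append(t)
    return found
```

Only 3^6 = 729 candidate tables need to be examined, and the single survivor is the table of the symmetric difference.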
DEFINITION. An operation * is said to be weakly associative if
a*(a*b) = (a*a)*b.


THEOREM 2.3. The only metric weakly associative operation in a Boolean
algebra is the symmetric difference.


Proof. In Theorem 2.1, we note that the full power of the associative law
was used only to show that if a = b*c then b = a*c and c = a*b. These results
follow from the weak associative law and the metricity of the operation, for let
b*c = a, a*b = x and a*c = y. Then

x = b*a = b*(b*c) = (b*b)*c = 0*c = c,

y = c*a = c*(c*b) = (c*c)*b = 0*b = b.


Associativity was used strongly in the preceding theorems, but is not used 
in the following theorem. 


DEFINITION. A quasigroup is a system consisting of a set of elements, together
with a binary operation which satisfies the law of unique solution. That is, if
a = b*c and two of these symbols are known, then the third is uniquely determined.
A loop is a quasigroup with a two-sided identity element.


DEFINITION. The Ptolemaic inequality holds for a quadrilateral if the three 
products (meets) of opposite sides satisfy the triangle inequality (M3). 













THEOREM 2.4. The only metric loop operation in a Boolean algebra is the 
symmetric difference. 


Proof. Let the loop identity be called e. Since e*e = e by the identity law,
and e*e = 0 by M1, it follows that e = 0. We now show that a*I = a′. By the
law of unique solution there exists an element y such that a*y = a′. Consider
the triangle 0, a, y. By the triangle inequality we have

a + y ≥ a′ and a′ + y ≥ a.

Thus

aa′ + ya′ ≥ a′a′ and a′a + ya ≥ aa,

whence y ≥ a′ and y ≥ a. Hence y = I, as the only element which is over both
a and a′ is I.

We now show that the Ptolemaic inequality holds for any quadrilateral
0, I, a, b. Letting a*b = x, we have 0*a = a, 0*I = I, I*a = a′, and I*b = b′.
The triangle inequality for triangle 0, a, b yields a + b ≥ x. Again, the triangle
inequality for triangle I, a, b yields a′ + b′ ≥ x. Hence (a + b)(a′ + b′) ≥ xx
or ab′ + a′b ≥ x. The other two cases are proved equally easily.

Now let a*b = x. We wish to show that x = ab′ + a′b. We have just found
that ab′ + a′b ≥ x. Consider the quadrilateral 0, I, a, b′. By the Ptolemaic
inequality, we have ab + a′b′ ≥ (a*b′). Hence

(ab′ + a′b)(ab + a′b′) ≥ x(a*b′)

or 0 ≥ x(a*b′), so x(a*b′) = 0. By the triangle inequality, x + (a*b′) ≥ I,
thus

x + (a*b′) = I.

Hence x′ = a*b′ by the definition of complement. In the same manner we show
that x′ = a′*b and x = a′*b′. From the triangle I, a′, b we obtain a + b′ ≥ x′,
and from the triangle I, a, b′ we obtain a′ + b ≥ x′. Hence

(a + b′)(a′ + b) = ab + a′b′ ≥ x′.

By DeMorgan's laws, we obtain ab′ + a′b ≤ x. This, together with the previous
result ab′ + a′b ≥ x, implies x = ab′ + a′b. This completes the proof.¹

By defining 0*a = a′, 0*a′ = a, 0*I = I, I*a = a, I*a′ = a′, and a*a′ = I
in the Boolean algebra of four elements 0, I, a, a′, we obtain an example which
shows that a metric quasigroup operation in a Boolean algebra need not be the
symmetric difference.
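The four-element example can be checked directly. The sketch below again encodes 0, a, a′, I as the bit masks 0, 1, 2, 3 (an assumption of this illustration) and tests the quasigroup and metric properties of the table just defined.

```python
from itertools import product

# Rows and columns are indexed by 0, a, a', I = 0, 1, 2, 3.
TABLE = [
    [0, 2, 1, 3],   # 0*0 = 0, 0*a = a', 0*a' = a, 0*I = I
    [2, 0, 3, 1],   # a*a' = I, a*I = a
    [1, 3, 0, 2],   # a'*I = a'
    [3, 1, 2, 0],
]


def is_latin_square(t):
    """Law of unique solution: every row and column is a permutation."""
    n = len(t)
    full = set(range(n))
    return (all(set(row) == full for row in t) and
            all({t[r][c] for r in range(n)} == full for c in range(n)))


def is_metric_table(t):
    """M1, M2, M3 with join = bitwise OR on the masks."""
    n = range(len(t))
    m1 = all((t[a][b] == 0) == (a == b) for a, b in product(n, repeat=2))
    m2 = all(t[a][b] == t[b][a] for a, b in product(n, repeat=2))
    m3 = all(((t[a][b] | t[b][c]) & t[a][c]) == t[a][c]
             for a, b, c in product(n, repeat=3))
    return m1 and m2 and m3


def has_identity(t):
    return any(all(t[e][x] == x for x in range(len(t))) for e in range(len(t)))
```

The table is a metric quasigroup but has no identity element, so it is not a loop and Theorem 2.4 does not apply to it.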


3. Boolean operations. Bernstein [1] has characterized Boolean group 
operations using a definition of a group which differs somewhat from the one 
now in use in that he did not require that the law of unique solution hold. I am 
indebted to Professor B. M. Stewart for pertinent observations which led to the 
following theorem. This theorem is similar to those in [1]. 


¹The referee observes that we need only have assumed a one-sided loop. Indeed, it is also
true that no use was made of the uniqueness of the solution.





THEOREM 3.1. Any Boolean group operation in a Boolean algebra is an
abelian group operation, and is of the form
x*y = e(xy + x′y′) + e′(xy′ + x′y)
where e is the group identity.
Proof. Since the operation * is Boolean, we may write
x*y = Axy + Bxy′ + Cx′y + Dx′y′


where A, B, C, and D are elements of the Boolean algebra (cf. [1]). We first
note that 0*D = CD, and that 0*C′ = DC, hence D = C′ by the law of unique
solution. Now D*0 = BD and B′*0 = DB implies D = B′ by the law of unique
solution. Let us designate the identity element of the group by e. We then have
e*e′ = e′ by group properties, but from our original relation we find that e*e′ =
B, hence B = e′. Since


e = e*e = Ae + De′ = Ae + B′e′ = Ae + ee′ = Ae,

we have that B′ = AB′. Now


A′*B = B(A′B′ + AB) + B′AB′ = AB + AB′ = A,


and


B*B = ABB + B(BB′ + B′B) + B′ = AB + B′ = AB + AB′.


Thus B*B = A, and A = B′ by the law of unique solution. Since A = B′ =
D = e, and B = D′ = C = e′, we may write


x*y = e(xy + x′y′) + e′(xy′ + x′y).


The fact that x*y = y*x is obvious, since the right-hand side of the above
expression is symmetric in x and y.


CoroLiary. The only Boolean group operation in a Boolean algebra with 0 
as the identity is the symmetric difference. 


This result may be weakened slightly to yield 


THEOREM 3.2. The only Boolean group operation in a Boolean algebra such
that $0 * 0 = 0$ is the symmetric difference.


Proof. From $x * y = e(xy + x'y') + e'(xy' + x'y)$ we obtain

$0 = 0 * 0 = eI = e.$


Noticing that no use was made of the associative law in the proof of Theorem 
3.1, we obtain another theorem. 


THEOREM 3.3. Any Boolean loop operation in a Boolean algebra is an abelian 
group operation and is of the form 


$x * y = e(xy + x'y') + e'(xy' + x'y),$


where e is the loop identity. 













Proof. The fact that the operation is of this form and is abelian is proved
exactly as in Theorem 3.1. We now show that the associative law holds. Using
the definition of $*$, it can be shown in a straightforward manner that

$z * (x * y) = xyz + x'y'z + xy'z' + x'yz'$

and that

$(z * x) * y = xyz + x'y'z + xy'z' + x'yz'.$


Now, since an associative loop is a group, the theorem follows. 
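
The computation above can be checked by brute force on a small example. The following is only a sketch under stated assumptions: the Boolean algebra is taken to be the eight subsets of a three-element set, encoded as bitmasks, and the names `star`, `comp`, and `is_abelian_group` are illustrative and not from the paper.

```python
from itertools import product

N_ATOMS = 3                      # assumption: algebra of subsets of a 3-element set
TOP = (1 << N_ATOMS) - 1         # the element I

def comp(x):
    """Boolean complement x' (bitwise, restricted to the algebra)."""
    return TOP & ~x

def star(e, x, y):
    """x*y = e(xy + x'y') + e'(xy' + x'y); meet is &, join is |."""
    same = (x & y) | (comp(x) & comp(y))
    diff = (x & comp(y)) | (comp(x) & y)
    return (e & same) | (comp(e) & diff)

def triple(x, y, z):
    """xyz + x'y'z + xy'z' + x'yz', the common value of both bracketings."""
    return ((x & y & z) | (comp(x) & comp(y) & z)
            | (x & comp(y) & comp(z)) | (comp(x) & y & comp(z)))

def is_abelian_group(e):
    elems = range(TOP + 1)
    identity = all(star(e, e, x) == x == star(e, x, e) for x in elems)
    commutative = all(star(e, x, y) == star(e, y, x)
                      for x, y in product(elems, repeat=2))
    associative = all(
        star(e, z, star(e, x, y)) == triple(x, y, z) == star(e, star(e, z, x), y)
        for x, y, z in product(elems, repeat=3))
    # Law of unique solution: each row of the operation table is a permutation.
    latin = all(len({star(e, x, y) for y in elems}) == TOP + 1 for x in elems)
    return identity and commutative and associative and latin

if __name__ == "__main__":
    print(all(is_abelian_group(e) for e in range(TOP + 1)))
```

With `e = 0` the function `star` reduces to the symmetric difference, in agreement with the corollaries.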


CoROLLARY. The only Boolean loop operation in a Boolean algebra with 0 as 
the loop identity is the symmetric difference. 


THEOREM 3.5. The only Boolean loop operation in a Boolean algebra such that
$0 * 0 = 0$ is the symmetric difference.


Proof. As in Theorem 3.2, it is easy to show that $e = 0$.


DEFINITION. A binary operation is called semi-metric if it satisfies M1 and
M2.


THEOREM 3.6. The only Boolean semi-metric operation in a Boolean algebra 
is the symmetric difference. 


Proof. Since the operation is Boolean, according to Bernstein [1] it is of the
form

$x * y = (I * I)xy + (I * 0)xy' + (0 * I)x'y + (0 * 0)x'y'.$

But since the operation is also semi-metric, we have that

$I * I = 0 * 0 = 0$, and $0 * I = I * 0$.

Let $0 * I = X$. We can determine $X$ by noting that

$I * X = X(IX' + I'X) = XX' = 0.$

Therefore $I = X$ by M1, and the theorem is proved.


4. Other characterizations. Frink [5] has characterized the symmetric
difference as the only Boolean group operation over which the meet distributes.
In this section we will not restrict ourselves to Boolean operations.


THEOREM 4.1. The only semi-metric group operation in a Boolean algebra over 
which the meet distributes is the symmetric difference. 


Proof. It can be shown that 0 is the group identity. If $a$, $b$, and $c$ are sides
of the triangle $l, m, n$, then $a * b = c$, $b * c = a$, and $a * c = b$. This follows from
the associative law and M1, for

$a * b = (l * m) * (m * n) = l * (0 * n) = l * n = c$

and similarly for the other two cases. We now show that the sum of any two
sides of our triangle is over the third. Since the meet distributes over $*$,

$(a + b)(a * b) = (a + b)a * (a + b)b = a * b.$

Hence $(a + b)c = c$, so $a + b \geq c$, which shows that the triangle inequality
holds. Thus $*$ is a metric group operation, and is the symmetric difference by
Theorem 2.1.

The following example shows that there are semi-metric group operations
over which the meet does not distribute. In the Boolean algebra of eight
elements, we define an operation $*$ by the following operation table:

  *  |  0    a    b    c    a'   b'   c'   I
 ----+----------------------------------------
  0  |  0    a    b    c    a'   b'   c'   I
  a  |  a    0    b'   c'   I    b    c    a'
  b  |  b    b'   0    a'   c    a    I    c'
  c  |  c    c'   a'   0    b    I    a    b'
  a' |  a'   I    c    b    0    c'   b'   a
  b' |  b'   b    a    I    c'   0    a'   c
  c' |  c'   c    I    a    b'   a'   0    b
  I  |  I    a'   c'   b'   a    c    b    0

Here $*$ is a semi-metric group operation, but

$a'(c * a) = a'c' = b$

while

$a'c * a'a = c * 0 = c.$
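
The claims about this example can be verified mechanically. The sketch below assumes that $a$, $b$, $c$ are the atoms of the eight-element algebra (which is what $a'c' = b$ requires) and uses the table entries as reconstructed from the group structure generated by $a * b = b'$, $a * c = c'$, $b * c = a'$; the bitmask encoding and all identifiers are illustrative.

```python
from itertools import product

# Assumed encoding: atoms a, b, c are the bits 1, 2, 4.
CODE = {"0": 0, "a": 1, "b": 2, "c": 4, "a'": 6, "b'": 5, "c'": 3, "I": 7}
NAME = {v: k for k, v in CODE.items()}
NAMES = ["0", "a", "b", "c", "a'", "b'", "c'", "I"]

ROWS = [
    "0  a  b  c  a' b' c' I",
    "a  0  b' c' I  b  c  a'",
    "b  b' 0  a' c  a  I  c'",
    "c  c' a' 0  b  I  a  b'",
    "a' I  c  b  0  c' b' a",
    "b' b  a  I  c' 0  a' c",
    "c' c  I  a  b' a' 0  b",
    "I  a' c' b' a  c  b  0",
]
OP = {(r, col): val
      for r, row in zip(NAMES, ROWS)
      for col, val in zip(NAMES, row.split())}

def star(x, y):
    return OP[(x, y)]

def meet(x, y):
    return NAME[CODE[x] & CODE[y]]

def is_semi_metric():
    m1 = all((star(x, y) == "0") == (x == y) for x, y in product(NAMES, repeat=2))
    m2 = all(star(x, y) == star(y, x) for x, y in product(NAMES, repeat=2))
    return m1 and m2

def is_group():
    assoc = all(star(x, star(y, z)) == star(star(x, y), z)
                for x, y, z in product(NAMES, repeat=3))
    ident = all(star("0", x) == x == star(x, "0") for x in NAMES)
    return assoc and ident      # x*x = 0 (from M1) supplies inverses

def meet_distributes():
    return all(meet(z, star(x, y)) == star(meet(z, x), meet(z, y))
               for x, y, z in product(NAMES, repeat=3))

if __name__ == "__main__":
    print(is_semi_metric(), is_group(), meet_distributes())
```

The counterexample of the text corresponds to `meet("a'", star("c", "a"))` versus `star(meet("a'", "c"), meet("a'", "a"))`.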


THEOREM 4.2. The only semi-metric semi-group operation in a Boolean algebra
over which the meet distributes is the symmetric difference.


Proof. If $a = b * c$, then $a * b = c$. For

$0 = a * (b * c) = (a * b) * c$

by the associative law, whence $a * b = c$ by M1. Thus $0 * a = a$, for if $0 * a = b$,
then $0 = a * b$ implies $a = b$. Now we show that the operation is metric exactly
as in Theorem 4.1, and Theorem 2.2 tells us that $*$ is the symmetric difference.


THEOREM 4.3. The only semi-metric weakly associative operation in a Boolean
algebra over which the meet distributes is the symmetric difference.


Proof. Since $a * a = 0$, it follows that

$0 = 0 * (a * a) = (0 * a) * a.$

Thus $a = 0 * a$ by M1. We prove that $*$ is metric as in Theorem 4.1, and apply
Theorem 2.3 to complete the proof.













DEFINITION. Let $\circ$ denote the symmetric difference. A binary operation $*$
is said to be quasi-analytical [8] when $(a * b) \circ (c * d) \leq a \circ c + b \circ d$ for all
quadruples $a, b, c, d$ of a Boolean algebra.


THEOREM 4.4 (Marczewski). The only quasi-analytical group operation in a 
Boolean algebra with 0 as the group identity is the symmetric difference. 


Proof. We will show first that $a = a^{-1}$.

$a = a \circ 0 = (a * 0) \circ (a * a^{-1}) \leq a \circ a + 0 \circ a^{-1} = a^{-1},$
$a^{-1} = a^{-1} \circ 0 = (a^{-1} * 0) \circ (a^{-1} * a) \leq a^{-1} \circ a^{-1} + 0 \circ a = a.$

Hence $a \leq a^{-1}$ and $a^{-1} \leq a$, which implies that $a = a^{-1}$.

Now $a * b = 0$ if and only if $a = b$. For, let $a = b$. Then $a * a = a * a^{-1} = 0$.
Conversely, let $a * b = 0$. Then $a * a^{-1} = 0$ implies $b = a^{-1} = a$ by the law of
unique solution. This proves M1. To prove M2, we write

$(a * b) * (b * a) = a * (b * (b * a)) = a * ((b * b) * a) = a * (0 * a) = a * a = 0.$

Hence $a * b = b * a$ by M1.

To prove M3, let $a$, $b$, and $c$ be sides of the triangle $l, m, n$ with $a = l * m$,
$b = m * n$, and $c = l * n$. Then

$a * b = (l * m) * (m * n) = l * n = c.$

Similarly $a = b * c$ and $b = a * c$. Now

$c = a * b = (0 * 0) \circ (a * b) \leq 0 \circ a + 0 \circ b = a + b.$


Thus M3 is proved, and * is a metric group operation in a Boolean algebra. 
Hence * is the symmetric difference by Theorem 2.1. 


In Marczewski's proof, he shows first that the operation $*$ is Boolean. It then
follows from Theorem 3.1 or Bernstein’s results [1] that the operation is the 
symmetric difference. 


5. Concluding remarks. Many of the foregoing results concerning Boolean 
algebras with metric operations are valid, with obvious modifications, in a 
generalized Boolean algebra, i.e., in a relatively complemented distributive 
lattice with 0. Thus Theorem 2.1 could read: 


THEOREM. The only metric group operation in a generalized Boolean algebra
is the "relative symmetric difference."


It would be interesting to know which lattices admit metric group operations. 
It is easy to construct examples of non-distributive modular lattices and non- 
modular lattices which admit such operations. However, it has been shown 
that the only distributive lattices satisfying the descending chain condition 







which admit metric group operations are the Boolean algebras.² Thus, for
example, the only finite distributive lattices admitting such operations are the 
finite Boolean algebras. Efforts are in progress to extend this result to all 
distributive lattices. Detailed proofs of the above remarks will be found in the 
author's thesis. 

Finally, it has recently come to our attention that our Theorem 2.1 has been 
in essence established by Gleason [6] in a note extending the work of Helson [7]. 





²This result is due to L. M. Kelly.


REFERENCES 

1. B.A. Bernstein, Operations with respect to which the elements of a Boolean algebra form a group, 
Trans. Amer. Math. Soc., 26 (1924), 171-175. 

2. ——, On the existence of fields in Boolean algebras, Trans. Amer. Math. Soc., 34 (1928), 
654-657. 

3. D. Ellis, Autometrized Boolean algebras 1, Can. J. Math., 3 (1951), 87-93. 

4. ——, Geometry in abstract distance spaces (Debrecen, Hungary, 1951), 3.

5. O. Frink, On the existence of linear algebras in Boolean algebras, Bull. Amer. Math. Soc., 34 
(1928), 329-333. 

6. A. M. Gleason, A note on a theorem of Helson, Colloquium Mathematicum, 2 (1949), 5-6. 

7. H. Helson, On the symmetric difference of sets as a group operator, Colloquium Mathematicum, 
1 (1948), 203-205. 

8. E. Marczewski, Concerning the symmetric difference in the theory of sets and in Boolean
algebras, Colloquium Mathematicum, 1 (1948), 199-202.





Michigan State College 











ON SOME MATRIX THEOREMS 
OF FROBENIUS AND McCOY 


J. K. GOLDHABER AND G. WHAPLES


1. Introduction. McCoy, following Frobenius, studied a problem which can
be described as follows. Let $k$ be an arbitrary field, $k^*$ its algebraic closure, and
$\mathfrak{A}$ any algebra of $n \times n$ matrices over $k$ which contains the identity $I$. Define a
canonical ordering to be a set of $n$ mappings $\lambda_i$ of $\mathfrak{A}$, or of a subset $\mathfrak{S}$ of $\mathfrak{A}$, into
$k^*$ such that the sequence $\lambda_1(A), \lambda_2(A), \ldots, \lambda_n(A)$, for each $A \in \mathfrak{S}$, consists of
the characteristic values (roots of $\det(A - xI) = 0$) of $A$, each with the right
multiplicity. Define a canonical ordering to be a Frobenius ordering if, for all
non-commutative polynomials $f(x_1, x_2, \ldots, x_m)$ and all finite subsets
$A_1, A_2, \ldots, A_m$ of $\mathfrak{A}$,

(1)  $\lambda_i(f(A_1, A_2, \ldots, A_m)) = f(\lambda_i(A_1), \ldots, \lambda_i(A_m)), \quad i = 1, \ldots, n.$

Say that $\mathfrak{A}$ has property F if it has a Frobenius ordering. (Previous authors
defined F in an apparently weaker fashion, demanding that (1) holds only for
elements of a fixed system of generators of $\mathfrak{A}$ rather than for all finite subsets;
but a simple substitution argument shows that their definition is equivalent to
ours. Also they assumed $k$ to be algebraically closed.)

Frobenius [3a] proved that every commutative $\mathfrak{A}$ has property F; McCoy
[5] proved that F is equivalent to the property

(M) $\mathfrak{A}/\mathrm{rad}\,\mathfrak{A}$ is commutative,

where $\mathrm{rad}\,\mathfrak{A}$ = radical of $\mathfrak{A}$ = maximal nilpotent left (or right, or two-sided)
ideal in $\mathfrak{A}$; Goldhaber [4] proved F equivalent to

(P) For every $A, B \in \mathfrak{A}$ there is a canonical ordering, possibly defined only
for $A$, $B$, and $A + B$, such that

(2)  $\lambda_i(A + B) = \lambda_i(A) + \lambda_i(B), \quad i = 1, 2, \ldots, n.$


There is also given in [4] a simple proof of the theorem of McCoy; however,
the proof of a crucial lemma there (our Lemma 2) is not valid for all $n$ unless $k$
has characteristic 0.

In the present paper we give a simple proof of this lemma which avoids all 
trouble with the characteristic, prove McCoy’s and Goldhaber’s theorems 
without restriction on the field k, and show that if k is quasi-algebraically closed 
(i.e. is not the centre of any non-commutative division algebra) then P can be 
replaced by the weaker condition 


Received March 29, 1952. 


(P') The sum of every two nilpotent elements of $\mathfrak{A}$ is nilpotent.


To see that P implies P’ recall that a matrix is nilpotent if and only if all its 
characteristic values are 0. 


2. Equivalence of F, M, P, and P' for quasi-algebraically closed k.
Throughout the paper $\mathfrak{A}$, $k$, $k^*$, $\lambda_i$ retain the meaning given them in the
introduction. All the algebras used are assumed to contain an identity element, and
if they are matric algebras of any dimension they are assumed to contain the
identity matrix of that dimension. (It is quite a simple matter to deduce
theorems from our work about algebras which do not contain the identity
matrix, but we omit this as not worth the effort.)


LEMMA 1. If $K$ is any field containing $k$, then $\mathrm{rad}\,(K \times_k \mathfrak{A})$ contains
$K \times_k (\mathrm{rad}\,\mathfrak{A})$.


For the necessary theory of the operation $K \times_k$ see [1], [2], or [3]. In [1] and
[3] this operation is called "extending the ground field." Since $\mathrm{rad}\,\mathfrak{A}$ is nilpotent
there is an integer $m$ such that every product of $m$ elements of $\mathrm{rad}\,\mathfrak{A}$ is 0. The
product of any $m$ elements of $K \times_k \mathrm{rad}\,\mathfrak{A}$ is a linear combination, over $K$, of
products of $m$ elements of $\mathrm{rad}\,\mathfrak{A}$, and hence is also 0. Thus $K \times_k \mathrm{rad}\,\mathfrak{A}$ is a
nilpotent ideal of $K \times_k \mathfrak{A}$, and Lemma 1 is proved.


LEMMA 2. If $A \in \mathfrak{A}$ and $N \in \mathrm{rad}\,\mathfrak{A}$ and $x$ is an indeterminate, then

(3)  $\det(A - xI) = \det(A + N - xI).$


Let $k(x)$ be the field of rational functions of $x$ over $k$. The matrix $(A - xI)$
has an inverse in $k(x) \times_k \mathfrak{A}$, and

(4)  $\det(A + N - xI) = \det(A - xI)\,\det(I + (A - xI)^{-1}N).$

By Lemma 1, $(A - xI)^{-1}N$ is nilpotent, hence it is similar to a matrix with
zeros on and above the main diagonal, hence the third determinant in (4)
equals 1 and (3) follows.
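
Lemma 2 can be illustrated numerically. The sketch below is an assumption-laden example, not the authors' argument: it takes the algebra of matrices $PTP^{-1}$ with $T$ upper triangular over the rationals, whose radical consists of the $PSP^{-1}$ with $S$ strictly upper triangular, and compares characteristic polynomials computed with the Faddeev-LeVerrier recursion (a standard method that is not mentioned in the paper). The particular matrices `P`, `T`, `S` are arbitrary illustrative choices.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def charpoly(M):
    """Coefficients [c_0, ..., c_n] of det(xI - M), by Faddeev-LeVerrier."""
    n = len(M)
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    Mk = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        Mk = matmul(M, Mk)                 # M_k = M * M_{k-1} + c_{n-k+1} * I
        for i in range(n):
            Mk[i][i] += c[n - k + 1]
        trace = sum(matmul(M, Mk)[i][i] for i in range(n))
        c[n - k] = -trace / k
    return c

# Illustrative data: P is unimodular, PINV is its inverse.
P = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
PINV = [[1, 0, 0], [-1, 1, 0], [1, -1, 1]]
T = [[2, 1, 3], [0, 5, 4], [0, 0, 7]]      # upper triangular
S = [[0, 6, 1], [0, 0, 8], [0, 0, 0]]      # strictly upper triangular

A = matmul(matmul(P, T), PINV)             # element of the algebra
N = matmul(matmul(P, S), PINV)             # element of its radical

if __name__ == "__main__":
    print(charpoly(A) == charpoly(matadd(A, N)))
```

Since $\det(A - xI) = (-1)^n \det(xI - A)$, equality of the two coefficient lists is equivalent to (3).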


THEOREM 1. For every field $k$ and every matric algebra $\mathfrak{A}$, F is equivalent to M.


McCoy [5] and Goldhaber [4] give proofs of this theorem when $k$ is
algebraically closed.

Suppose that $k$ is arbitrary and that $\mathfrak{A}$ satisfies M, i.e., $\mathfrak{A}/\mathrm{rad}\,\mathfrak{A}$ is
commutative. By Lemma 1, $(k^* \times_k \mathfrak{A})/\mathrm{rad}\,(k^* \times_k \mathfrak{A})$ is a homomorphic image of
$(k^* \times_k \mathfrak{A})/(k^* \times_k \mathrm{rad}\,\mathfrak{A})$, hence $k^* \times_k \mathfrak{A}$ has property M, hence it has property F. If we
identify $k^* \times_k \mathfrak{A}$ with an algebra of matrices over $k^*$ in the obvious way, then
$\mathfrak{A}$ is contained in $k^* \times_k \mathfrak{A}$ and clearly $\mathfrak{A}$ also has property F.


LEMMA 4. Let $\mathfrak{A}$ and $\mathfrak{A}^*$ be two matric algebras over $k$, let $A \to A^*$ be a
homomorphism of $\mathfrak{A}$ into $\mathfrak{A}^*$ which maps the identity matrix of $\mathfrak{A}$ onto the identity
matrix of $\mathfrak{A}^*$ and has precisely $\mathrm{rad}\,\mathfrak{A}$ as its kernel. Then for every $A \in \mathfrak{A}$ the
matrices $A$ and $A^*$ have the same set of characteristic values (though not in general
with the same multiplicities).


Let f(x) and f*(x) be the minimum polynomials of A and of A* respectively. 
Since f*(A) is nilpotent, f(x) divides some power of f*(x). On the other hand, 
f(A) = 0 implies that f(A*) = 0, and hence f*(x) divides f(x). Our lemma now 
follows from the well-known fact that the minimum equation and the charac- 
teristic equation have the same set of roots. 


THEOREM 2. If k is quasi-algebraically closed, then F, M, P, and P’ are 
equivalent. 


We already know that M implies F, F implies P, and P implies P’, and thus it 
suffices to prove that P’ implies M or, what is the same thing, that not M implies 
not P’. 

Suppose then that $\mathfrak{A}/\mathrm{rad}\,\mathfrak{A}$ is not commutative. Since it is a direct sum of
simple algebras it must, in view of our assumption on $k$, contain a simple
component which is a total matric algebra of dimension at least two (or order at least
four) over $k$. Hence the algebra $\mathfrak{A}^*$ of Lemma 4 can be so chosen that it contains
two elements $A^*$ and $B^*$ (images of elements $A$ and $B$ of $\mathfrak{A}$) which have in their
upper left hand corners the elements

$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$

and which have zeros in all other positions.

$A^*$ and $B^*$ are clearly nilpotent; Lemma 4 tells us that their inverse images
$A$ and $B$ are nilpotent (but not in $\mathrm{rad}\,\mathfrak{A}$). Since $A^* + B^*$ is obviously not
nilpotent, we see in just the same way that $A + B$ is not nilpotent. We have
proved that not M implies not P'.
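
The key step, two nilpotent matrices with a non-nilpotent sum, is easy to check directly. The sketch below assumes, for illustration, that the $2 \times 2$ corner blocks are the standard matrix units $E_{12}$ and $E_{21}$; the helper names are hypothetical.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def is_nilpotent(M):
    """An n x n matrix is nilpotent exactly when its n-th power vanishes."""
    n = len(M)
    power = M
    for _ in range(n - 1):
        power = matmul(power, M)
    return all(entry == 0 for row in power for entry in row)

A_STAR = [[0, 1], [0, 0]]     # assumed corner block of A*
B_STAR = [[0, 0], [1, 0]]     # assumed corner block of B*
SUM = [[a + b for a, b in zip(r, s)] for r, s in zip(A_STAR, B_STAR)]

if __name__ == "__main__":
    print(is_nilpotent(A_STAR), is_nilpotent(B_STAR), is_nilpotent(SUM))
```

The square of the sum is the identity matrix, so the sum fails to be nilpotent over any field, whatever the characteristic.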


3. Equivalence of F, M, and P for arbitrary k. The next theorem requires 
the more elaborate methods of [4]. 


THEOREM 3. If $k$ is any field, $\mathfrak{A}$ any algebra over $k$, then F, M, and P are
equivalent.


In view of Theorem 2 and the well-known fact [1; 3] that Galois fields are
quasi-algebraically closed, we may, and shall, assume that $k$ has an infinite
number of elements.

Suppose that $\mathfrak{A}$ has property P. According to Theorem 6.1 of [4] (the proof of
which does not require the algebraic closure of $k$, but only the existence of an
infinite number of distinct elements in $k$), $\mathfrak{A}$ has a canonical ordering such that
for any finite subset $A_1, A_2, \ldots, A_m$ of $\mathfrak{A}$, all $a_\nu \in k$, and all $i = 1, 2, \ldots, n$,

(5)  $\lambda_i\bigl(\sum_\nu a_\nu A_\nu\bigr) = \sum_\nu a_\nu \lambda_i(A_\nu).$







Let $A_1, A_2, \ldots, A_m$ be a linear $k$-basis for $\mathfrak{A}$, let $t_1, t_2, \ldots, t_m, x$ be commutative
indeterminates over $k$, and consider the polynomial

(6)  $\det\bigl(\sum_\nu t_\nu A_\nu - xI\bigr) - \prod_i \bigl(\sum_\nu t_\nu \lambda_i(A_\nu) - x\bigr).$

From (5) it follows that for every specialization of the $t_\nu$ and $x$ into $k$, (6) is
equal to zero. Consequently by [6, p. 70] we have

(7)  $\det\bigl(\sum_\nu t_\nu A_\nu - xI\bigr) = \prod_i \bigl(\sum_\nu t_\nu \lambda_i(A_\nu) - x\bigr),$

each side of (7) being considered as an element of the ring $k^*[t_1, t_2, \ldots, t_m, x]$.
Now form the algebra $\mathfrak{A}^* = k^* \times_k \mathfrak{A}$ and, as before, consider its elements as
$n \times n$ matrices with elements in $k^*$. The matrices $A_1, A_2, \ldots, A_m$ are a $k^*$-basis
for $\mathfrak{A}^*$. (7) shows that if we use (5) to define a set of mappings $\lambda_1, \lambda_2, \ldots, \lambda_n$ of
$\mathfrak{A}^*$ into $k^*$, allowing the $a_\nu$ to be elements of $k^*$, the resulting set of mappings is a
canonical ordering on $\mathfrak{A}^*$. It obviously satisfies P.
By Theorem 2, $\mathfrak{A}^*$ has property F, hence its subalgebra $\mathfrak{A}$ has property F.


REFERENCES 


1. A. A. Albert, Structure of algebras (New York, 1946). 

2. E. Artin and G. Whaples, The theory of simple rings, Amer. J. Math., 65 (1943), 87-107.

3. M. Deuring, Algebren (Ergebnisse der Math., vol. 4, Berlin, 1935).

3a. G. Frobenius, Über vertauschbare Matrizen, Sitz. preuss. Akad. Wiss. (1896), 601-614.

4. J. K. Goldhaber, The homomorphic mapping of certain matric algebras onto rings of diagonal 
matrices, Can. J. Math., 4 (1952), 31-42. 

5. N. H. McCoy, On the characteristic roots of matric polynomials, Bull. Amer. Math. Soc., 42 
(1936), 592-600. 

6. B. L. van der Waerden, Moderne Algebra, 2nd ed., vol. 1 (Berlin, 1937). 


University of Connecticut 
Indiana University 











SOME REMARKS ON THE CHARACTERS OF THE 
SYMMETRIC GROUP 


MASARU OSIMA 


Introduction. In [2], we derived some character relations of the symmetric
group $S_n$. These relations were also obtained in [1] independently. In the present
paper, we shall study the properties of these character relations in some detail.
In the last section, using a result obtained in [3], we shall further determine the
number of modular irreducible representations in a $p$-block of $S_n$.


1. We shall denote by $[\alpha]$ the irreducible representation of $S_n$ corresponding
to a diagram $[\alpha]$ of $n$ nodes, and by $\chi_\alpha$ its character. Similarly we define the
irreducible representation $[\beta_u]$ of $S_{n-u}$, and its character $\chi_{\beta_u}$. We denote by $m(n)$
the number of distinct irreducible representations of $S_n$. Then, as is well known,
the number of classes of conjugate elements in $S_n$ is equal to $m(n)$.

Let $Q = A \cdot U$ be an element of $S_n$ where $U$ is a single cycle of length $u$, and $A$
is any permutation on the remaining $n - u$ symbols. By the Murnaghan-Nakayama
recursion formula

(1.1)  $\chi_\alpha(A \cdot U) = \sum_{\beta_u} a_{\alpha\beta_u}\, \chi_{\beta_u}(A).$

Here,

$a_{\alpha\beta_u} = (-1)^{r}$

if a diagram $[\beta_u]$ of $S_{n-u}$ is obtainable from $[\alpha]$ by the removal of a single $u$-hook
$H$ with leg length $r$, and

$a_{\alpha\beta_u} = 0$

otherwise. We set

(1.2)  $\mu_\alpha^{(u)} = \sum_{\beta_u} a_{\alpha\beta_u}\, \chi_{\beta_u}.$

$\mu_\alpha^{(u)}$ is called the (generalized) character of $S_{n-u}$ corresponding to $\chi_\alpha$.
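
The recursion (1.1) lends itself to a short program. The sketch below is one standard way to implement it, not taken from the paper: a diagram is a non-increasing tuple, hook removal is done on the set of first-column hook lengths (beta-numbers), and the leg length is the number of beta-numbers jumped over. The function names are illustrative.

```python
from functools import lru_cache

def remove_hooks(alpha, u):
    """Yield (sign, beta) for each diagram beta obtained from alpha by
    removing a single u-hook; sign is (-1)**(leg length)."""
    k = len(alpha)
    betas = [alpha[i] + (k - 1 - i) for i in range(k)]   # strictly decreasing
    occupied = set(betas)
    for b in betas:
        target = b - u
        if target < 0 or target in occupied:
            continue
        leg = sum(1 for c in betas if target < c < b)
        new = sorted((occupied - {b}) | {target}, reverse=True)
        part = tuple(x - (k - 1 - i) for i, x in enumerate(new))
        yield (-1) ** leg, tuple(x for x in part if x > 0)

@lru_cache(maxsize=None)
def character(alpha, cycle_type):
    """Value of the irreducible character chi_alpha on the class with the
    given cycle lengths, by repeated use of (1.1)."""
    if not cycle_type:
        return 1 if not alpha else 0
    u, rest = cycle_type[0], cycle_type[1:]
    return sum(sign * character(beta, rest)
               for sign, beta in remove_hooks(alpha, u))

if __name__ == "__main__":
    # Character table row of [2,1] in S_3: identity, transposition, 3-cycle.
    print([character((2, 1), ct) for ct in [(1, 1, 1), (2, 1), (3,)]])
```

The coefficients $a_{\alpha\beta_u}$ of the text are exactly the signs produced by `remove_hooks(alpha, u)`, with diagrams not produced having coefficient 0.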
Let $A_1, A_2, \ldots, A_{m(n-u)}$ be a complete system of representatives for the
classes of conjugate elements in $S_{n-u}$. If we set

(1.3)  $Z = (\chi_\alpha(A_i \cdot U)),$

then

(1.4)  $Z'Z = (n(A_i \cdot U)\,\delta_{ij}),$

where $Z'$ is the transpose of $Z$ and $n(A_i \cdot U)$ is the order of the normalizer
$N(A_i \cdot U)$ of $A_i \cdot U$ in $S_n$. Since we have from (1.1),


Received April 7, 1952. 


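(1.5)  $Z = (a_{\alpha\beta_u})(\chi_{\beta_u}(A_i)) = (a_{\alpha\beta_u})\,Z_{\beta_u},$

(1.4) gives

(1.6)  $Z'(a_{\alpha\beta_u})\,Z_{\beta_u} = (n(A_i \cdot U)\,\delta_{ij}).$

Hence, if we set

(1.7)  $\rho_{\beta_u}^{(u)}(A_i \cdot U) = \sum_\alpha a_{\alpha\beta_u}\, \chi_\alpha(A_i \cdot U), \qquad X = (\rho_{\beta_u}^{(u)}(A_i \cdot U)),$

then (1.6) becomes

(1.8)  $X' Z_{\beta_u} = (n(A_i \cdot U)\,\delta_{ij}),$

that is,

(1.9)  $\sum_{\beta_u} \rho_{\beta_u}^{(u)}(A_i \cdot U)\, \chi_{\beta_u}(A_j) = n(A_i \cdot U)\,\delta_{ij}.$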


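If an element $P$ of $S_n$ possesses no $u$-cycle, then by [2]

(1.10)  $\rho_{\beta_u}^{(u)}(P) = 0.$

We shall call

$\rho_{\beta_u}^{(u)} = \sum_\alpha a_{\alpha\beta_u}\, \chi_\alpha$

the (generalized) character of $S_n$ corresponding to $\chi_{\beta_u}$ of $S_{n-u}$. If we set
$T = (n(A_i \cdot U)\,\delta_{ij})$, then from (1.8) we have

$T^{-1} X' Z_{\beta_u} = I,$

where $I$ is the unit matrix. Since $X$ and $Z_{\beta_u}$ are square matrices,

$Z_{\beta_u} T^{-1} X' = I.$

Then, from $T^{-1} = (g(A_i \cdot U)\,\delta_{ij})/n!$ we have

$Z_{\beta_u}\,(g(A_i \cdot U)\,\delta_{ij})\,X' = n!\,I,$

which may be written

(1.11)  $\sum_i g(A_i \cdot U)\, \rho_{\beta_u}^{(u)}(A_i \cdot U)\, \chi_{\beta'_u}(A_i) = \begin{cases} n! & \text{for } [\beta_u] = [\beta'_u], \\ 0 & \text{for } [\beta_u] \neq [\beta'_u], \end{cases}$

where $g(A_i \cdot U) = n!/n(A_i \cdot U)$.
If $A_i$ possesses $t$ $u$-cycles, then we have generally:

$g(A_i \cdot U) = \dfrac{1}{t+1} \times$ (number of conjugates of $U$ in $S_n$) $\times$ (number of conjugates of $A_i$ in $S_{n-u}$).

In case $[\beta'_u]$ is the 1-representation of $S_{n-u}$, (1.11) becomes

(1.12)  $\sum_i g(A_i \cdot U)\, \rho_{\beta_u}^{(u)}(A_i \cdot U) = \sum_H \rho_{\beta_u}^{(u)}(H) = \begin{cases} n! & \text{for the 1-representation } [\beta_u], \\ 0 & \text{otherwise.} \end{cases}$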


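Here, $H$ ranges over all elements of $S_n$ which possess at least one $u$-cycle.

THEOREM 1. If $Q$ is an element of $S_n$ with $t$ $u$-cycles, then

$\rho_{\beta_u}^{(u)}(Q) = tu\, \chi_{\beta_u}(Q^{(u)})$

where $Q^{(u)}$ is a permutation on the $n - u$ symbols obtained from $Q$ by the removal of
a single $u$-cycle.


Proof. Let $Q^{(u)}$ be conjugate with $A_i$. Then

$\rho_{\beta_u}^{(u)}(Q) = \rho_{\beta_u}^{(u)}(A_i \cdot U).$

If we denote by $n_u(A_i)$ the order of the normalizer $N_u(A_i)$ of $A_i$ in $S_{n-u}$, then

(1.13)  $\sum_{\beta_u} \chi_{\beta_u}(A_i)\, \chi_{\beta_u}(A_j) = n_u(A_i)\,\delta_{ij}.$

Since $n(A_i \cdot U)/n_u(A_i) = tu$, we obtain

$\sum_{\beta_u} tu\, \chi_{\beta_u}(A_i)\, \chi_{\beta_u}(A_j) = n(A_i \cdot U)\,\delta_{ij}.$

This, combined with (1.9), gives

$\rho_{\beta_u}^{(u)}(A_i \cdot U) = tu\, \chi_{\beta_u}(A_i),$

whence

$\rho_{\beta_u}^{(u)}(Q) = tu\, \chi_{\beta_u}(Q^{(u)}).$

If we set

(1.14)  $(b_{\beta_u\beta'_u}) = (a_{\alpha\beta_u})'\,(a_{\alpha\beta_u})$

then

$b_{\beta_u\beta'_u} = \sum_\alpha a_{\alpha\beta_u}\, a_{\alpha\beta'_u}$

and

(1.15)  $\rho_{\beta_u}^{(u)}(A_i \cdot U) = \sum_{\beta'_u} b_{\beta_u\beta'_u}\, \chi_{\beta'_u}(A_i).$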

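THEOREM 2. If $A_i \cdot U$ possesses $t_i$ $u$-cycles, then

$|b_{\beta_u\beta'_u}| = u^{m(n-u)} \prod_i t_i.$

Proof. From (1.6) and (1.13), we have

$Z'_{\beta_u}(a_{\alpha\beta_u})'\,(a_{\alpha\beta_u})\,Z_{\beta_u} = Z'_{\beta_u}(b_{\beta_u\beta'_u})\,Z_{\beta_u} = (n(A_i \cdot U)\,\delta_{ij})$

and

$Z'_{\beta_u} Z_{\beta_u} = (n_u(A_i)\,\delta_{ij}).$

Hence

$|b_{\beta_u\beta'_u}| = \prod_i n(A_i \cdot U) \Big/ \prod_i n_u(A_i) = \prod_i (t_i u) = u^{m(n-u)} \prod_i t_i.$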




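Let $A = B \cdot V$ be an element of $S_{n-u}$, where $V$ is a single cycle of length $v$
($v \neq u$) and $B$ is any permutation on the remaining $n - (u + v)$ symbols.
We shall denote by $[\beta_{u+v}]$ an irreducible representation of $S_{n-(u+v)}$. Then, for the
character $\mu_{\beta_u}^{(v)}$ of $S_{n-(u+v)}$ corresponding to $\chi_{\beta_u}$, we have

(1.16)  $\chi_{\beta_u}(B \cdot V) = \mu_{\beta_u}^{(v)}(B) = \sum_{\beta_{u+v}} a_{\beta_u\beta_{u+v}}\, \chi_{\beta_{u+v}}(B).$


THEOREM 3. Let $\rho_{\beta_{u+v}}^{(u)}$ be the character of $S_{n-v}$ corresponding to $\chi_{\beta_{u+v}}$. Then

$\rho_{\beta_u}^{(u)}(Q) = \sum_{\beta_{u+v}} a_{\beta_u\beta_{u+v}}\, \rho_{\beta_{u+v}}^{(u)}(Q^{(v)}),$

where $Q$ is an element of $S_n$ with at least one $v$-cycle and $Q^{(v)}$ is a permutation on
the $n - v$ symbols obtained from $Q$ by the removal of a single $v$-cycle.


Proof. For $Q$ without $u$-cycle, we have by (1.10)

$\rho_{\beta_u}^{(u)}(Q) = 0, \qquad \rho_{\beta_{u+v}}^{(u)}(Q^{(v)}) = 0.$

For $Q$ with $t$ $u$-cycles, we have by Theorem 1

$\rho_{\beta_u}^{(u)}(Q) = tu\, \chi_{\beta_u}(Q^{(u)}), \qquad \rho_{\beta_{u+v}}^{(u)}(Q^{(v)}) = tu\, \chi_{\beta_{u+v}}(Q^{(u,v)}),$

where $Q = Q^{(u)} \cdot U = Q^{(v)} \cdot V = Q^{(u,v)} \cdot U \cdot V$. Hence by (1.16)

$\sum_{\beta_{u+v}} a_{\beta_u\beta_{u+v}}\, \rho_{\beta_{u+v}}^{(u)}(Q^{(v)}) = tu \sum_{\beta_{u+v}} a_{\beta_u\beta_{u+v}}\, \chi_{\beta_{u+v}}(Q^{(u,v)}) = tu\, \mu_{\beta_u}^{(v)}(Q^{(u,v)}) = tu\, \chi_{\beta_u}(Q^{(u)}) = \rho_{\beta_u}^{(u)}(Q).$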

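2. We shall consider the character of a representation $[\alpha]$ for an element
$Q = B \cdot V \cdot U$ where $U$, $V$ are cycles of lengths $u$, $v$ ($u \neq v$), and $B$ is a permutation
on the remaining $n - (u + v)$ symbols. Applying the Murnaghan-Nakayama
recursion formula twice, we obtain

$\chi_\alpha(Q) = \sum_{\beta_u} a_{\alpha\beta_u}\, \chi_{\beta_u}(B \cdot V) = \sum_{\beta_u} a_{\alpha\beta_u} \sum_{\beta_{u+v}} a_{\beta_u\beta_{u+v}}\, \chi_{\beta_{u+v}}(B)$

and

$\chi_\alpha(Q) = \sum_{\beta_v} a_{\alpha\beta_v}\, \chi_{\beta_v}(B \cdot U) = \sum_{\beta_v} a_{\alpha\beta_v} \sum_{\beta_{u+v}} a_{\beta_v\beta_{u+v}}\, \chi_{\beta_{u+v}}(B).$

Here, $[\beta_u]$, $[\beta_v]$, $[\beta_{u+v}]$ are representations of $S_{n-u}$, $S_{n-v}$, $S_{n-(u+v)}$ respectively.
Then it follows that

(2.1)  $\sum_{\beta_u} a_{\alpha\beta_u}\, a_{\beta_u\beta_{u+v}} = \sum_{\beta_v} a_{\alpha\beta_v}\, a_{\beta_v\beta_{u+v}},$

that is, in matrix form

(2.2)  $(a_{\alpha\beta_u})(a_{\beta_u\beta_{u+v}}) = (a_{\alpha\beta_v})(a_{\beta_v\beta_{u+v}}).$

We set

(2.3)  $(a_{\alpha\beta_{u+v}}) = (a_{\alpha\beta_u})(a_{\beta_u\beta_{u+v}}).$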









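Then we can define the character

$\mu_\alpha^{(u,v)} = \mu_\alpha^{(v,u)}$

of $S_{n-(u+v)}$ corresponding to $\chi_\alpha$ and the character

$\rho_{\beta_{u+v}}^{(u,v)} = \rho_{\beta_{u+v}}^{(v,u)}$

of $S_n$ corresponding to $\chi_{\beta_{u+v}}$ as follows:

(2.4)  $\mu_\alpha^{(u,v)} = \sum_{\beta_{u+v}} a_{\alpha\beta_{u+v}}\, \chi_{\beta_{u+v}},$

(2.5)  $\rho_{\beta_{u+v}}^{(u,v)} = \sum_\alpha a_{\alpha\beta_{u+v}}\, \chi_\alpha.$

The character $\rho_{\beta_{u+v}}^{(u,v)}$ is called the character of type $(u, v)$. Equation (2.1) shows
that

(2.6)  $\rho_{\beta_{u+v}}^{(u,v)} = \sum_{\beta_u} a_{\beta_u\beta_{u+v}}\, \rho_{\beta_u}^{(u)} = \sum_{\beta_v} a_{\beta_v\beta_{u+v}}\, \rho_{\beta_v}^{(v)}.$

Generally we can define in the same way the character

$\rho_{\beta_{u+v+\ldots+w}}^{(u,v,\ldots,w)}$

of type $(u, v, \ldots, w)$ of $S_n$ corresponding to $\chi_{\beta_{u+v+\ldots+w}}$
of $S_{n-(u+v+\ldots+w)}$. Let $G_1, G_2, \ldots, G_z$ ($z = m(n - (u + v + \ldots + w))$) be a
complete system of representatives for the classes of conjugate elements in
$S_{n-(u+v+\ldots+w)}$. Corresponding to (1.10), we can prove by the method used in
[2], the following


THEOREM 4. If an element $P$ of $S_n$ is not conjugate to $G_i \cdot W \cdots V \cdot U$ ($i = 1, 2, \ldots, z$), then

$\rho_{\beta_{u+v+\ldots+w}}^{(u,v,\ldots,w)}(P) = 0.$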
3. Let $p$ be a rational prime and let

(3.1)  $n = r + wp, \qquad 0 \leq r < p.$

A $p$-singular element of $S_n$ has at least one cycle of length $p$ or a multiple of $p$
while a $p$-regular element is simply a permutation, the lengths of whose cycles
are all prime to $p$. If a $p$-singular element $P$ of $S_n$ has only a $\lambda p$-cycle as cycle
of length a multiple of $p$, then $P$ will be called an element of type $(\lambda)$. Generally
we may define in a similar way an element of type $(\lambda_1, \lambda_2, \ldots, \lambda_t)$ where
$\lambda_1 < \lambda_2 < \ldots < \lambda_t$ and $\sum \lambda_i \leq w$. We denote by $D(\lambda_1, \lambda_2, \ldots, \lambda_t)$ the number
of classes of conjugate elements in $S_n$ which contain the elements of type
$(\lambda_1, \lambda_2, \ldots, \lambda_t)$. If

$\tfrac{1}{2}q(q + 1) \leq w < \tfrac{1}{2}(q + 1)(q + 2),$








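then the maximal value of $t$ which satisfies $\lambda_1 < \lambda_2 < \ldots < \lambda_t$, $\sum \lambda_i \leq w$
is $q$. We set

(3.2)  $\sum_{\lambda_1 < \lambda_2 < \ldots < \lambda_t} D(\lambda_1, \lambda_2, \ldots, \lambda_t) = h_t$

and

(3.3)  $\sum_{i=t}^{q} h_i = k_t \qquad (t = 1, 2, \ldots, q).$

Denote by $m'(n)$ the number of $p$-singular classes in $S_n$. Then we see easily
that

(3.4)  $m'(n) = k_1.$

Let $m(n)$ be the number of classes of conjugate elements in $S_n$ as in §1. We set

(3.5)  $\sum_{\lambda_1 < \lambda_2 < \ldots < \lambda_t} m(n - (\lambda_1 + \lambda_2 + \ldots + \lambda_t)p) = s_t \qquad (t = 1, 2, \ldots, q).$

Then (3.2) and (3.5) yield

(3.6)  $s_t = h_t + \binom{t+1}{t} h_{t+1} + \ldots + \binom{q}{t} h_q \qquad (t = 1, 2, \ldots, q).$

We obtain readily from (3.6)

(3.7)  $h_t = s_t - \binom{t+1}{t} s_{t+1} + \ldots + (-1)^{q-t} \binom{q}{t} s_q \qquad (t = 1, 2, \ldots, q).$

THEOREM 5. Let $m'(n)$ be the number of $p$-singular classes in $S_n$. Then

$m'(n) = s_1 - s_2 + s_3 - \ldots + (-1)^{q-1} s_q.$

Proof. From (3.6) we have

$s_1 - s_2 + s_3 - \ldots + (-1)^{q-1} s_q = \sum_{t=1}^{q} \Bigl( \binom{t}{1} - \binom{t}{2} + \ldots + (-1)^{t-1} \binom{t}{t} \Bigr) h_t.$

Each alternating sum of binomial coefficients equals 1, so the right-hand side is $\sum h_t = k_1 = m'(n)$.

COROLLARY.

$s_2 - s_3 + s_4 - \ldots + (-1)^{q} s_q = \sum_{t=2}^{q} k_t.$

Proof.

$s_2 - s_3 + s_4 - \ldots + (-1)^{q} s_q = s_1 - \sum_{t=1}^{q} h_t = h_2 + 2h_3 + 3h_4 + \ldots + (q - 1)h_q = \sum_{t=2}^{q} k_t.$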
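
Theorem 5 can be tested on small cases by enumerating partitions. The following sketch makes two assumptions that the text leaves implicit: classes of $S_n$ are identified with partitions of $n$, and $m(j)$ is taken to be 1 for $j = 0$ and 0 for $j < 0$. The function names are illustrative.

```python
from itertools import combinations

def partitions(n, largest=None):
    """Generate the partitions of n as non-increasing tuples."""
    if largest is None or largest > n:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(largest, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def m(n):
    """Number of classes of S_n; 0 for negative n (assumed convention)."""
    return sum(1 for _ in partitions(n)) if n >= 0 else 0

def m_singular(n, p):
    """Number of p-singular classes: some cycle length divisible by p."""
    return sum(1 for part in partitions(n) if any(x % p == 0 for x in part))

def s(t, n, p):
    """s_t of (3.5): sum of m(n - (l_1+...+l_t)p) over l_1 < ... < l_t."""
    w = n // p
    return sum(m(n - sum(lams) * p)
               for lams in combinations(range(1, w + 1), t)
               if sum(lams) <= w)

def alternating_sum(n, p):
    """s_1 - s_2 + s_3 - ...; terms beyond q vanish automatically."""
    w = n // p
    return sum((-1) ** (t - 1) * s(t, n, p) for t in range(1, w + 1))

if __name__ == "__main__":
    print(all(m_singular(n, p) == alternating_sum(n, p)
              for p in (2, 3, 5) for n in range(1, 13)))
```

The identity is an inclusion-exclusion count: $m(n - (\lambda_1 + \ldots + \lambda_t)p)$ is the number of classes containing at least one cycle of each length $\lambda_i p$.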
4. Let $\lambda_i$ ($i = 1, 2, \ldots, t$) be positive integers such that $\lambda_1 < \lambda_2 < \ldots < \lambda_t$
and $u = \sum \lambda_i \leq w$. In the following we shall denote by

$\chi_i^{(u)} \qquad (i = 1, 2, \ldots, m(n - up))$













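the characters of distinct irreducible representations of $S_{n-up}$, and by

$\rho_i^{\lambda_1\lambda_2\ldots\lambda_t}$

the characters of type $(\lambda_1 p, \lambda_2 p, \ldots, \lambda_t p)$ of $S_n$ corresponding to $\chi_i^{(u)}$ of $S_{n-up}$.
If $P$ is not conjugate to

$V \cdot P_{\lambda_1} \cdot P_{\lambda_2} \cdots P_{\lambda_t},$

where $P_{\lambda_i}$ is a cycle of length $\lambda_i p$, and $V$ is any permutation on the remaining
$n - up$ symbols, then we have, by Theorem 4,

(4.1)  $\rho_i^{\lambda_1\lambda_2\ldots\lambda_t}(P) = 0.$

Further, if $V$ is a $p$-regular element of $S_{n-up}$, then

(4.2)  $\rho_i^{\lambda_1\lambda_2\ldots\lambda_t}(V \cdot P_{\lambda_1} \cdot P_{\lambda_2} \cdots P_{\lambda_t}) = \lambda_1 \lambda_2 \cdots \lambda_t\, p^t\, \chi_i^{(u)}(V).$

In particular, we obtain

THEOREM 6. If $H$ is a $p$-regular element of $S_n$, then for any type $(\lambda_1 p, \lambda_2 p, \ldots, \lambda_t p)$

$\rho_i^{\lambda_1\lambda_2\ldots\lambda_t}(H) = 0.$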

Let $P_1, P_2, \ldots, P_{m'(n)}$ be a complete system of representatives for the
$p$-singular classes in $S_n$. If we set

(4.3)  $R_1 = (\rho_i^{\lambda}(P_j))$

($j$, row index; $\lambda$, $i$, column indices; where $\lambda = 1, 2, \ldots, w$; $i = 1, 2, \ldots,
m(n - \lambda p)$; $j = 1, 2, \ldots, m'(n)$), then $R_1$ is a matrix of type $(m'(n), s_1)$ and
we have proved in [2]

(4.4)  $r(R_1) = m'(n) = k_1,$

where $r(R_1)$ denotes the rank of $R_1$. Generally we set

(4.5)  $R_t = (\rho_i^{\lambda_1\lambda_2\ldots\lambda_t}(P_j))$

($j$, row index; $(\lambda_1, \lambda_2, \ldots, \lambda_t)$, $i$, column indices). Then $R_t$ is a matrix of type
$(m'(n), s_t)$ and we can prove, as in [2], the following


THEOREM 7. Let $r(R_t)$ be the rank of $R_t$. Then $r(R_t) = k_t$ where $k_t$ is the
number defined in (3.3).


5. Let $[\alpha_0]$ be a $p$-core of $S_{n-up}$. Then $[\alpha_0]$ determines uniquely a $p$-block
$B[\alpha_0]$ of $S_n$. We call $u$ the weight of $B[\alpha_0]$. As in [2], we define $l^*(u)$ by

(5.1)  $l^*(u) = \sum m(\nu_1)\, m(\nu_2) \cdots m(\nu_{p-1}),$

where the $\nu_i$ are the positive integers or zero, and the summation extends over
all sets $(\nu_1, \nu_2, \ldots, \nu_{p-1})$ which satisfy $\sum \nu_i = u$. Let $c(m)$ be the number of
$p$-cores of $m$ nodes. We set $c(0) = 1$. Then we have by [2]






(5.2)  $m^*(n) = \sum_{u=0}^{w} c(n - up)\, l^*(u)$

where $m^*(n)$ is the number of $p$-regular classes in $S_n$, i.e., the number of modular
irreducible representations of $S_n$.


THEOREM 8. The number of modular irreducible representations in a $p$-block
of weight $v$ is $l^*(v)$.


Proof. By [3], the number of modular irreducible representations in any
$p$-block of weight $v$ is independent of the $p$-core. Hence we denote this number
by $f(v)$. We have

(5.3)  $m^*(n) = \sum_{u=0}^{w} c(n - up)\, f(u).$

Since $l^*(0) = f(0) = 1$ and $l^*(1) = f(1) = p - 1$, we assume that $l^*(u) = f(u)$
for $u < v$. We set $n = vp$ in (5.2) and (5.3). Then

$m^*(n) = \sum_{u=0}^{v} c(vp - up)\, l^*(u) = l^*(v) + \sum_{u=0}^{v-1} c(vp - up)\, l^*(u)$

and

$m^*(n) = \sum_{u=0}^{v} c(vp - up)\, f(u) = f(v) + \sum_{u=0}^{v-1} c(vp - up)\, f(u).$

By our assumption, $l^*(u) = f(u)$ ($u = 1, 2, \ldots, v - 1$). Hence we obtain
$l^*(v) = f(v)$.
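
The counting identity (5.2), on which the induction rests, can be checked numerically. The sketch below assumes the usual combinatorial descriptions, none of which is spelled out in the text: a $p$-core is a diagram with no hook length divisible by $p$, $p$-regular classes correspond to partitions with no part divisible by $p$, and $m(0) = 1$. All identifiers are illustrative.

```python
def partitions(n, largest=None):
    """Generate the partitions of n as non-increasing tuples."""
    if largest is None or largest > n:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(largest, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def m(n):
    return sum(1 for _ in partitions(n)) if n >= 0 else 0

def hook_lengths(part):
    """Hook lengths of all nodes of the diagram of `part`."""
    conj = [sum(1 for r in part if r > j) for j in range(part[0])] if part else []
    for i, row in enumerate(part):
        for j in range(row):
            yield (row - j - 1) + (conj[j] - i - 1) + 1

def c(nodes, p):
    """Number of p-cores with the given number of nodes; c(0) = 1."""
    if nodes < 0:
        return 0
    return sum(1 for part in partitions(nodes)
               if all(h % p for h in hook_lengths(part)))

def l_star(u, p):
    """(5.1): coefficient of x^u in (sum_k m(k) x^k)^(p-1)."""
    base = [m(k) for k in range(u + 1)]
    poly = [1] + [0] * u
    for _ in range(p - 1):
        poly = [sum(poly[i] * base[d - i] for i in range(d + 1))
                for d in range(u + 1)]
    return poly[u]

def m_regular(n, p):
    """Number of p-regular classes of S_n."""
    return sum(1 for part in partitions(n) if all(x % p for x in part))

def rhs_5_2(n, p):
    return sum(c(n - u * p, p) * l_star(u, p) for u in range(n // p + 1))

if __name__ == "__main__":
    print(all(m_regular(n, p) == rhs_5_2(n, p)
              for p in (2, 3, 5) for n in range(0, 13)))
```

In particular `l_star(1, p)` returns `p - 1`, the starting value used in the proof.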


REFERENCES 
1. J. H. Chung, Modular representations of the symmetric group, Can. J. Math., 3 (1951), 
309-327. 
2. M. Osima, On some character relations of symmetric groups, Okayama Math. J., 1 (1952), 
63-68. 
3. G. de B. Robinson, On the modular representations of the symmetric group, Proc. Nat. Acad.
Sci., 37 (1951), 694-696.


Okayama University













ON $\mathfrak{p}$-ADIC INTEGRAL REPRESENTATIONS OF FINITE
GROUPS


JEAN-MARIE MARANDA 


1. Introduction. It has been shown by Diederichsen [2] that for integral 
representations of a finite group, the irreducible constituents in any complete 
reduction are not necessarily unique up to order and unimodular equivalence. 
In this same article, it is shown that for certain finite groups, such as the cyclic 
group of order 4, there are infinitely many classes of indecomposable representa- 
tions under unimodular equivalence. 

A natural method for studying these problems of arithmetical representation
theory would be the $\mathfrak{p}$-adic approach, and as a first step in this direction, using
the methods of Hensel and of Brauer and Nesbitt [1], we shall show that the
theory of representations of finite groups over a ring of $\mathfrak{p}$-adic integers can
always be brought back to the modular case, in so far as it is concerned with
questions of unimodular equivalence, reduction, and decomposition.

More particularly, we shall show that for any finite group, if $\mathfrak{p}$ is a generator
of the maximal ideal in the ring of $\mathfrak{p}$-adic integers considered, and if $\mathfrak{p}^{k_0}$ is the
highest power of $\mathfrak{p}$ dividing the order of the group, then unimodular equivalence
may be considered modulo $\mathfrak{p}^k$, for any $k > k_0$, while unimodular reduction and
decomposition may be considered modulo $\mathfrak{p}^k$, for any $k > 2k_0$, without any loss
of generality.

As a corollary, we shall show that if $\mathfrak{p}$ does not divide the order of the group
then all questions of unimodular equivalence, reduction and decomposition are
completely equivalent to these same questions modulo $\mathfrak{p}$.


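2. Modular binding systems.¹ Let $\mathfrak{o}$ be a commutative ring with a
1-element and let $\mathfrak{A}$ be an ideal of $\mathfrak{o}$ (possibly the null ideal). Let $\mathfrak{S}$ be a
hypercomplex system over $\mathfrak{o}$ and let $\Gamma$ and $\Delta$ be two $\mathfrak{A}$-modular representations of $\mathfrak{S}$,
by matrices with entries in $\mathfrak{o}$, of degrees $n_1$ and $n_2$ respectively.

We shall consider the $\mathfrak{A}$-modular representations of $\mathfrak{S}$, by matrices with
entries in $\mathfrak{o}$, having $\Gamma$ as a top constituent and $\Delta$ as the corresponding bottom
constituent, i.e.

(1)  $x \to D(x) = \begin{pmatrix} \Gamma(x) & \Lambda(x) \\ \theta(x) & \Delta(x) \end{pmatrix},$

where $\theta(x) \equiv 0 \pmod{\mathfrak{A}}$ for all $x \in \mathfrak{S}$. Since $D$ is an $\mathfrak{A}$-modular representation
of $\mathfrak{S}$, the following laws must hold: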


Received August 13, 1952. This research was done with the aid of a grant from the National 
Research Council of Canada. The author wishes to thank Professor H. Zassenhaus for suggest- 
ing the problem and for his constant guidance. 

¹For an account of the theory of ordinary binding systems, see [4, pp. 276-279; 2, pp.
364-374].




D(x + y) = D(x) + Dy) (mod WM), 
D(cx) = cCD(x) (mod W), 
D(xy) = D(x)D(y) (mod %), 


for all x, y € § and all c € ©. From these laws one deduces the following: 


A(x + y) = A(x) + A(y) (mod %), 
(2) A(cx) = cA(x) (mod YW), 
A(xy) = T(x) A(y) + A(x) A(y) mod %). 


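Any system \(\Lambda = \{\Lambda(x)\}\) (\(x\) running through all the elements of \(\mathfrak{H}\)) of \(n_1 \times n_2\)
matrices obeying the laws (2) will be called an \(\mathfrak{A}\)-modular binding system
determined by the representations \(\Gamma\) and \(\Delta\). Evidently, any such system determines
an \(\mathfrak{A}\)-modular representation of \(\mathfrak{H}\) of the type (1) for any choice of \(\theta\),
provided \(\theta(x) \equiv 0 \pmod{\mathfrak{A}}\) for all \(x \in \mathfrak{H}\). Because of the linearity of the
congruences (2), it is easily verified that the set \(\mathfrak{S}(\Gamma, \Delta, \mathfrak{A})\) of all \(\mathfrak{A}\)-modular binding
systems determined by \(\Gamma\) and \(\Delta\) is an \(\mathfrak{o}\)-module under the following operations:

\[
\{(\Lambda + \Lambda')(x)\} = \{\Lambda(x) + \Lambda'(x)\}, \qquad \{(c\Lambda)(x)\} = \{c\Lambda(x)\}, \tag{3}
\]

for all \(x \in \mathfrak{H}\), all \(c \in \mathfrak{o}\), and all \(\Lambda, \Lambda' \in \mathfrak{S}(\Gamma, \Delta, \mathfrak{A})\).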
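Two \(\mathfrak{A}\)-modular representations of the type (1)

\[
D_i(x) = \begin{pmatrix} \Gamma(x) & \Lambda_i(x) \\ \theta_i(x) & \Delta(x) \end{pmatrix} \qquad (i = 1, 2),
\]

where \(\theta_i(x) \equiv 0 \pmod{\mathfrak{A}}\) for all \(x \in \mathfrak{H}\), are said to be "strongly" equivalent if
there is a matrix

\[
P = \begin{pmatrix} I_{n_1} & T \\ 0 & I_{n_2} \end{pmatrix},
\]

where \(I_{n_1}\) and \(I_{n_2}\) are the unity matrices of degrees \(n_1\) and \(n_2\) respectively,
and where \(T\) is any \(n_1 \times n_2\) \(\mathfrak{o}\)-matrix, such that

\[
P^{-1}D_1(x)P \equiv D_2(x) \pmod{\mathfrak{A}},
\]

for all \(x \in \mathfrak{H}\). Since

\[
P^{-1} = \begin{pmatrix} I_{n_1} & -T \\ 0 & I_{n_2} \end{pmatrix},
\]

this implies that

\[
\Lambda_2(x) \equiv \Lambda_1(x) + (\Gamma(x)T - T\Delta(x)) \pmod{\mathfrak{A}}, \tag{4}
\]

for all \(x \in \mathfrak{H}\).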
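The step from the transformation by \(P\) to the congruence (4) can be written out as a routine block multiplication. The display below is a worked sketch under the definitions above (with \(I\) standing for the unity matrices of the appropriate degrees); it adds nothing beyond the stated assumption \(\theta_1(x) \equiv 0 \pmod{\mathfrak{A}}\).

```latex
\[
P^{-1}D_1(x)P
= \begin{pmatrix} I & -T \\ 0 & I \end{pmatrix}
  \begin{pmatrix} \Gamma(x) & \Lambda_1(x) \\ \theta_1(x) & \Delta(x) \end{pmatrix}
  \begin{pmatrix} I & T \\ 0 & I \end{pmatrix}
= \begin{pmatrix}
    \Gamma(x) - T\theta_1(x) & \Lambda_1(x) + \Gamma(x)T - T\Delta(x) - T\theta_1(x)T \\
    \theta_1(x) & \Delta(x) + \theta_1(x)T
  \end{pmatrix}.
\]
% Every term containing \theta_1(x) vanishes modulo \mathfrak{A}, so the
% upper right-hand block is congruent to
% \Lambda_1(x) + \Gamma(x)T - T\Delta(x), which is the right-hand side of (4).
```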













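Conversely, if \(\Lambda_1\) and \(\Lambda_2\) are two \(\mathfrak{A}\)-modular binding systems in \(\mathfrak{S}(\Gamma, \Delta, \mathfrak{A})\),
for which there exists an \(n_1 \times n_2\) \(\mathfrak{o}\)-matrix \(T\) such that the condition (4) is
satisfied, then evidently, the \(\mathfrak{A}\)-modular representations of the type (1), \(D_1\) and
\(D_2\), determined by \(\Lambda_1\) and \(\Lambda_2\) respectively, are strongly equivalent and the
transforming matrix is \(P\) as given above.

If the condition (4) holds for the binding systems \(\Lambda_1\) and \(\Lambda_2\), they are said to
be strongly equivalent.

It is easily verified that the set \(\mathfrak{B}_0(\Gamma, \Delta, \mathfrak{A})\) of all \(\mathfrak{A}\)-modular binding systems
\(\Lambda \in \mathfrak{S}(\Gamma, \Delta, \mathfrak{A})\) which are strongly equivalent to zero, i.e., for which there
exists an \(n_1 \times n_2\) \(\mathfrak{o}\)-matrix \(T\) such that

\[
\Lambda(x) \equiv \Gamma(x)T - T\Delta(x) \pmod{\mathfrak{A}}, \tag{5}
\]

for all \(x \in \mathfrak{H}\), is an \(\mathfrak{o}\)-submodule of \(\mathfrak{S}(\Gamma, \Delta, \mathfrak{A})\). Evidently, the statement
\(\Lambda \in \mathfrak{B}_0(\Gamma, \Delta, \mathfrak{A})\) means that the representation

\[
D(x) = \begin{pmatrix} \Gamma(x) & \Lambda(x) \\ \theta(x) & \Delta(x) \end{pmatrix},
\]

where \(\theta(x) \equiv 0 \pmod{\mathfrak{A}}\) for all \(x \in \mathfrak{H}\), is fully reducible,² modulo \(\mathfrak{A}\).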

From now on, we shall always suppose that \(\mathfrak{H}\) is the group algebra of some
finite group \(\mathfrak{G}\) of order \(N\), and we shall confine ourselves to the \(\mathfrak{A}\)-modular
representations of \(\mathfrak{H}\) which map the unity element of \(\mathfrak{G}\) onto the unity matrix,
modulo \(\mathfrak{A}\). The fundamental theorem is the following:


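THEOREM 1. For any binding system \(\Lambda \in \mathfrak{S}(\Gamma, \Delta, \mathfrak{A})\),

\[
N \cdot \Lambda \in \mathfrak{B}_0(\Gamma, \Delta, \mathfrak{A}).
\]

Proof. Let \(x\) and \(y\) be any two elements of \(\mathfrak{G}\) and let \(\Lambda\) be any binding system
in \(\mathfrak{S}(\Gamma, \Delta, \mathfrak{A})\). Then

\[
\begin{aligned}
\Lambda(xy) &\equiv \Gamma(x)\Lambda(y) + \Lambda(x)\Delta(y) \pmod{\mathfrak{A}},\\
\Lambda(xy)\Delta(y^{-1}) &\equiv \Gamma(x)\Lambda(y)\Delta(y^{-1}) + \Lambda(x) \pmod{\mathfrak{A}},
\end{aligned}
\]

and therefore, using \(\Delta(y^{-1}) \equiv \Delta((xy)^{-1})\Delta(x)\),

\[
\Lambda(x) \equiv \Lambda(xy)\Delta((xy)^{-1}) \cdot \Delta(x) - \Gamma(x) \cdot \Lambda(y)\Delta(y^{-1}) \pmod{\mathfrak{A}}.
\]

Now let \(y\) run through all the elements of \(\mathfrak{G}\) and sum:

\[
N \cdot \Lambda(x) \equiv \sum_{y} \Lambda(xy)\Delta((xy)^{-1}) \cdot \Delta(x) - \Gamma(x) \cdot \sum_{y} \Lambda(y)\Delta(y^{-1}) \pmod{\mathfrak{A}}.
\]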








²In this article we will use the expressions complete reduction and complete decomposition
of a representation to denote a reduction of this representation into its irreducible constituents
and a decomposition into its indecomposable components respectively. The term full reduction
will be used to denote a reduction which can be transformed into a decomposition where the
components are equivalent to the constituents of the given reduction.







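Since, as \(y\) runs through all the elements of \(\mathfrak{G}\), \(xy\) also runs through all the
elements of \(\mathfrak{G}\), we may write

\[
N \cdot \Lambda(x) \equiv \sum_{z} \Lambda(z)\Delta(z^{-1}) \cdot \Delta(x) - \Gamma(x) \cdot \sum_{z} \Lambda(z)\Delta(z^{-1}) \pmod{\mathfrak{A}}.
\]

Setting \(T = -\sum_{z} \Lambda(z)\Delta(z^{-1})\), we obtain

\[
N \cdot \Lambda(x) \equiv \Gamma(x)T - T\Delta(x) \pmod{\mathfrak{A}}
\]

for all \(x \in \mathfrak{G}\). Since this last condition is a linear congruence, it will also hold
for all the elements of \(\mathfrak{H}\), so that

\[
N \cdot \Lambda(x) \equiv \Gamma(x)T - T\Delta(x) \pmod{\mathfrak{A}}
\]

for all \(x \in \mathfrak{H}\), i.e. \(N \cdot \Lambda \in \mathfrak{B}_0(\Gamma, \Delta, \mathfrak{A})\).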
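Theorem 1 can be checked numerically in the simplest setting, the null ideal with integer matrices, where every congruence becomes an equality. The sketch below is only an illustration: the cyclic group of order 4, its generator matrix `a`, and the helper names `generate_group` and `check_theorem_1` are assumptions chosen for the example and do not come from the text.

```python
import numpy as np


def generate_group(gens):
    """Close a list of invertible integer matrices under multiplication."""
    identity = np.eye(gens[0].shape[0], dtype=int)
    elems = {identity.tobytes(): identity}
    frontier = [identity]
    while frontier:
        nxt = []
        for x in frontier:
            for g in gens:
                y = x @ g
                if y.tobytes() not in elems:
                    elems[y.tobytes()] = y
                    nxt.append(y)
        frontier = nxt
    return list(elems.values())


def check_theorem_1(gens, n1):
    """Return (N, T, ok) for block upper triangular matrices with top block of degree n1."""
    group = generate_group(gens)
    N = len(group)
    identity = np.eye(gens[0].shape[0], dtype=int)

    def inverse(x):
        # Exact inverse, found inside the finite group itself.
        return next(y for y in group if np.array_equal(x @ y, identity))

    gamma = lambda x: x[:n1, :n1]   # top constituent
    lam = lambda x: x[:n1, n1:]     # binding system
    delta = lambda x: x[n1:, n1:]   # bottom constituent

    # T = - sum over z of Lambda(z) Delta(z^{-1}), as in the proof.
    T = -sum(lam(z) @ delta(inverse(z)) for z in group)
    ok = all(
        np.array_equal(N * lam(x), gamma(x) @ T - T @ delta(x)) for x in group
    )
    return N, T, ok


# Hypothetical example: a matrix of order 4 with Gamma = (1) and Delta of degree 2.
a = np.array([[1, 1, 0], [0, 0, -1], [0, 1, 0]], dtype=int)
N, T, ok = check_theorem_1([a], n1=1)
print(N, T.tolist(), ok)
```

Here the binding system is read off as the upper right-hand block of each group element, so the check covers the elements of the group only; linearity extends it to the group algebra, as in the proof.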


3. On the connections between \(\mathfrak{p}\)-adic integral and modular representations
of a finite group. From now on, we will suppose that the ring \(\mathfrak{o}\) considered
in the preceding section is the ring of \(\mathfrak{p}\)-adic integers of some \(\mathfrak{p}\)-adic field³ \(K\).

Let \(\mathfrak{p}\) be a generator of the maximal ideal of \(\mathfrak{o}\) and let \(\mathfrak{p}^{k_0}\) be the highest
power of \(\mathfrak{p}\) dividing \(N\). \(N\) is therefore a unit times \(\mathfrak{p}^{k_0}\).

THEOREM 2. If \(\Gamma\) and \(\Delta\) are two ordinary \(\mathfrak{p}\)-adic integral representations of
\(\mathfrak{H}\), then \(\Gamma\) and \(\Delta\) are unimodularly equivalent if and only if they are unimodularly
equivalent, modulo \(\mathfrak{p}^k\), for any \(k > k_0\).


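Proof. If \(\Gamma\) and \(\Delta\) are unimodularly equivalent, modulo \(\mathfrak{p}^k\), then they must
be of the same degree (\(n_1 = n_2 = n\)), and there must exist an \(n \times n\) \(\mathfrak{o}\)-matrix
\(T\) such that

\[
\Gamma(x)T - T\Delta(x) \equiv 0 \pmod{\mathfrak{p}^k} \tag{6}
\]

for all \(x \in \mathfrak{H}\) and \(|T| \not\equiv 0 \pmod{\mathfrak{p}}\). We will now apply the theory of the preceding
section in the case that \(\mathfrak{A}\) is the null ideal (0), since \(\Gamma\) and \(\Delta\) are ordinary
\(\mathfrak{p}\)-adic integral representations. Then congruence modulo \(\mathfrak{A}\) just means equality.

From (6), the matrices

\[
\Gamma(x)\frac{T}{\mathfrak{p}^k} - \frac{T}{\mathfrak{p}^k}\Delta(x)
\]

are integral for all \(x \in \mathfrak{H}\), and one can easily verify that the set of all these
matrices is a binding system in \(\mathfrak{S}(\Gamma, \Delta, (0))\). (This is not necessarily true in the
case that \(\Gamma\) and \(\Delta\) are modular representations.) Then, by Theorem 1,

\[
N\Bigl\{\Gamma(x)\frac{T}{\mathfrak{p}^k} - \frac{T}{\mathfrak{p}^k}\Delta(x)\Bigr\} \in \mathfrak{B}_0(\Gamma, \Delta, (0)),
\]

i.e., there exists an \(n \times n\) \(\mathfrak{o}\)-matrix \(T'\) such that

\[
N\Bigl(\Gamma(x)\frac{T}{\mathfrak{p}^k} - \frac{T}{\mathfrak{p}^k}\Delta(x)\Bigr) = \Gamma(x)T' - T'\Delta(x)
\]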


³For an exposition of the theory of fields with valuations, see [3, chap. X].













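for all \(x \in \mathfrak{H}\). But since \(N = u\,\mathfrak{p}^{k_0}\), where \(u\) is a unit of \(\mathfrak{o}\), we may write

\[
\Gamma(x)\frac{T}{\mathfrak{p}^{k-k_0}} - \frac{T}{\mathfrak{p}^{k-k_0}}\Delta(x) = \Gamma(x)T'' - T''\Delta(x)
\]

for all \(x \in \mathfrak{H}\), where \(T'' = u^{-1}T'\) is an \(n \times n\) \(\mathfrak{o}\)-matrix. Then

\[
\Gamma(x)(T - \mathfrak{p}^{k-k_0}T'') - (T - \mathfrak{p}^{k-k_0}T'')\Delta(x) = 0
\]

for all \(x \in \mathfrak{H}\) and setting \(T^* = T - \mathfrak{p}^{k-k_0}T''\), we obtain

\[
\Gamma(x)T^* - T^*\Delta(x) = 0
\]

for all \(x \in \mathfrak{H}\), and \(T^* \equiv T \pmod{\mathfrak{p}^{k-k_0}}\). Since \(k > k_0\), this implies that \(T^* \equiv T \pmod{\mathfrak{p}}\),
and therefore \(|T^*| \equiv |T| \not\equiv 0 \pmod{\mathfrak{p}}\), so that \(T^*\) is unimodular.
The converse is immediate.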


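THEOREM 3. If \(D\) is an ordinary \(\mathfrak{p}\)-adic integral representation of \(\mathfrak{H}\) of degree
\(n\) and if there is an \(n \times n\) \(\mathfrak{o}\)-matrix \(Q\) such that

\[
Q^{-1}D(x)Q \equiv \begin{pmatrix} \Gamma(x) & \Lambda(x) \\ 0 & \Delta(x) \end{pmatrix} \pmod{\mathfrak{p}^k}, \qquad k > 2k_0,
\]

for all \(x \in \mathfrak{H}\), and \(|Q| \not\equiv 0 \pmod{\mathfrak{p}}\), so that \(\Gamma\) and \(\Delta\) are modular constituents of
\(D\), modulo \(\mathfrak{p}^k\), of degrees \(n_1\) and \(n_2\) respectively, then there is an \(n \times n\) \(\mathfrak{o}\)-matrix \(M\)
such that

\[
M^{-1}D(x)M = \begin{pmatrix} \Gamma^*(x) & \Lambda(x) \\ 0 & \Delta^*(x) \end{pmatrix}
\]

for all \(x \in \mathfrak{H}\), and \(|M| \not\equiv 0 \pmod{\mathfrak{p}}\), where the ordinary constituents \(\Gamma^*\) and \(\Delta^*\),
of degrees \(n_1\) and \(n_2\) respectively, are such that

\[
\Gamma^*(x) \equiv \Gamma(x), \qquad \Delta^*(x) \equiv \Delta(x) \pmod{\mathfrak{p}^{k-k_0}}
\]

for all \(x \in \mathfrak{H}\).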


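Proof. Let \(D_0 = Q^{-1}DQ\) and assume that we have determined a finite
sequence of \(n \times n\) \(\mathfrak{o}\)-matrices

\[
Q_i = \begin{pmatrix} I_{n_1} & 0 \\ T_i & I_{n_2} \end{pmatrix} \qquad (i = 1, 2, \dots, m),
\]

and a finite sequence of natural numbers \(k = k_1 < k_2 < \dots < k_m\), such that
for all \(i = 1, 2, \dots, m\),

\[
Q_i \equiv Q_{i-1} \pmod{\mathfrak{p}^{k_{i-1}-k_0}}, \tag{8}
\]

\[
D_i = Q_i^{-1}D_0Q_i = \begin{pmatrix} \Gamma_i & \Lambda \\ \mathfrak{p}^{k_i}\theta_i & \Delta_i \end{pmatrix}, \tag{9}
\]

where the degrees of the \(\Gamma_i\) and the \(\Delta_i\) are \(n_1\) and \(n_2\) respectively. We will show
that one can always extend both these sequences by another term.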







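Since \(D_m\) is an ordinary \(\mathfrak{p}\)-adic integral representation of \(\mathfrak{H}\),

\[
\begin{aligned}
D_m(x + y) &= D_m(x) + D_m(y),\\
D_m(cx) &= cD_m(x),\\
D_m(xy) &= D_m(x)D_m(y),
\end{aligned}
\]

for all \(x, y \in \mathfrak{H}\), and all \(c \in \mathfrak{o}\), and from this one can deduce that

\[
\begin{aligned}
\theta_m(x + y) &= \theta_m(x) + \theta_m(y),\\
\theta_m(cx) &= c\theta_m(x),\\
\theta_m(xy) &= \Delta_m(x)\theta_m(y) + \theta_m(x)\Gamma_m(y),
\end{aligned}
\]

for all \(x, y \in \mathfrak{H}\) and all \(c \in \mathfrak{o}\). Therefore \(\theta_m\) is surely a binding system in
\(\mathfrak{S}(\Delta_m, \Gamma_m, (\mathfrak{p}^{k_m}))\). Then by Theorem 1, there exists an \(n_2 \times n_1\) \(\mathfrak{o}\)-matrix \(T\)
such that

\[
N\theta_m(x) \equiv \Delta_m(x)T - T\Gamma_m(x) \pmod{\mathfrak{p}^{k_m}},
\]

for all \(x \in \mathfrak{H}\). Then since \(N = u\,\mathfrak{p}^{k_0}\), where \(u\) is a unit, we may write

\[
\mathfrak{p}^{k_0}\theta_m(x) \equiv \Delta_m(x)S - S\Gamma_m(x) \pmod{\mathfrak{p}^{k_m}}, \tag{10}
\]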


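where \(S = u^{-1}T\) is an \(n_2 \times n_1\) \(\mathfrak{o}\)-matrix. Let

\[
P = \begin{pmatrix} I_{n_1} & 0 \\ -\mathfrak{p}^{k_m-k_0}S & I_{n_2} \end{pmatrix}
\]

and let

\[
Q_{m+1} = Q_mP = \begin{pmatrix} I_{n_1} & 0 \\ T_{m+1} & I_{n_2} \end{pmatrix},
\]

where \(T_{m+1} = T_m - \mathfrak{p}^{k_m-k_0}S\). Then since \(P \equiv I_n \pmod{\mathfrak{p}^{k_m-k_0}}\), we have

\[
Q_{m+1} \equiv Q_m \pmod{\mathfrak{p}^{k_m-k_0}}. \tag{11}
\]

Also, setting

\[
\begin{aligned}
D_{m+1}(x) &= Q_{m+1}^{-1}D_0(x)Q_{m+1} = P^{-1}(Q_m^{-1}D_0(x)Q_m)P = P^{-1}D_m(x)P\\
&= \begin{pmatrix}
\Gamma_m(x) - \mathfrak{p}^{k_m-k_0}\Lambda(x)S & \Lambda(x)\\
\mathfrak{p}^{k_m-k_0}\bigl(\mathfrak{p}^{k_0}\theta_m(x) - (\Delta_m(x)S - S\Gamma_m(x))\bigr) - \mathfrak{p}^{2(k_m-k_0)}S\Lambda(x)S & \Delta_m(x) + \mathfrak{p}^{k_m-k_0}S\Lambda(x)
\end{pmatrix},
\end{aligned}
\]

we see, from (10), that the lower left-hand entry of this matrix is divisible by
\(\mathfrak{p}^{2(k_m-k_0)}\) so that we may write

\[
D_{m+1}(x) = \begin{pmatrix} \Gamma_{m+1}(x) & \Lambda(x) \\ \mathfrak{p}^{k_{m+1}}\theta_{m+1}(x) & \Delta_{m+1}(x) \end{pmatrix} \tag{12}
\]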













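for all \(x \in \mathfrak{H}\), where

\[
\begin{aligned}
\Gamma_{m+1}(x) &= \Gamma_m(x) - \mathfrak{p}^{k_m-k_0}\Lambda(x)S,\\
\Delta_{m+1}(x) &= \Delta_m(x) + \mathfrak{p}^{k_m-k_0}S\Lambda(x),\\
\mathfrak{p}^{k_{m+1}}\theta_{m+1}(x) &= \mathfrak{p}^{k_m-k_0}\bigl(\mathfrak{p}^{k_0}\theta_m(x) - (\Delta_m(x)S - S\Gamma_m(x))\bigr) - \mathfrak{p}^{2(k_m-k_0)}S\Lambda(x)S.
\end{aligned}
\]

Setting \(k_{m+1} = 2(k_m - k_0) = 2k_m - 2k_0 > 2k_m - k_m = k_m\) (since \(k_m \geq k > 2k_0\)), we obtain the desired
result.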
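The role of the hypothesis \(k > 2k_0\) can be seen by iterating the recursion \(k_{m+1} = 2(k_m - k_0)\) on a few numbers. The following sketch is illustrative only; the function name `exponent_sequence` and the sample values are assumptions, not part of the text.

```python
def exponent_sequence(k, k0, steps):
    """Return the first `steps` exponents k_1 = k, k_{m+1} = 2 * (k_m - k0)."""
    seq = [k]
    for _ in range(steps - 1):
        seq.append(2 * (seq[-1] - k0))
    return seq


# With k > 2*k0 the exponents increase strictly, so the construction converges.
print(exponent_sequence(3, 1, 5))
# With k = 2*k0 the recursion stalls and no progress is made.
print(exponent_sequence(2, 1, 3))
```

The first call yields a strictly increasing sequence, while the second stays constant, which is why the borderline case \(k = 2k_0\) is excluded by this method of proof.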
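Then, by induction, there is an infinite sequence of \(n \times n\) \(\mathfrak{o}\)-matrices

\[
Q_i = \begin{pmatrix} I_{n_1} & 0 \\ T_i & I_{n_2} \end{pmatrix} \qquad (i = 1, 2, 3, \dots),
\]

and an infinite sequence of natural numbers \(k = k_1 < k_2 < k_3 < \dots\) such that
for all \(i = 1, 2, 3, \dots\),

\[
Q_i \equiv Q_{i-1} \pmod{\mathfrak{p}^{k_{i-1}-k_0}} \tag{13}
\]

and

\[
D_i(x) = Q_i^{-1}D_0(x)Q_i = \begin{pmatrix} \Gamma_i(x) & \Lambda(x) \\ \mathfrak{p}^{k_i}\theta_i(x) & \Delta_i(x) \end{pmatrix} \tag{14}
\]

for all \(x \in \mathfrak{H}\), where the \(\Gamma_i\) and the \(\Delta_i\) are of degree \(n_1\) and \(n_2\) respectively.
From (13), we see that the sequence \(\{Q_i\}\) converges, and that if we set

\[
Q^* = \lim_{i \to \infty} Q_i,
\]

then \(Q^*\) is of the form

\[
Q^* = \begin{pmatrix} I_{n_1} & 0 \\ \mathfrak{p}^{k-k_0}T^* & I_{n_2} \end{pmatrix} \tag{15}
\]

and therefore unimodular. Furthermore, if we set

\[
D^*(x) = Q^{*-1}D_0(x)Q^* = \Bigl(\lim_{i \to \infty} Q_i^{-1}\Bigr) D_0(x) \Bigl(\lim_{i \to \infty} Q_i\Bigr)
= \lim_{i \to \infty} Q_i^{-1}D_0(x)Q_i = \lim_{i \to \infty} D_i(x),
\]

we see, from (14), that \(D^*\) is of the form

\[
D^*(x) = \begin{pmatrix} \Gamma^*(x) & \Lambda(x) \\ 0 & \Delta^*(x) \end{pmatrix} \tag{16}
\]

for all \(x \in \mathfrak{H}\), where

\[
\Gamma^*(x) = \lim_{i \to \infty} \Gamma_i(x), \qquad \Delta^*(x) = \lim_{i \to \infty} \Delta_i(x).
\]

Also,

\[
D^*(x) = Q^{*-1}D_0(x)Q^* = \begin{pmatrix} \Gamma(x) + \mathfrak{p}^{k-k_0}\Lambda(x)T^* & \Lambda(x) \\ 0 & \Delta(x) - \mathfrak{p}^{k-k_0}T^*\Lambda(x) \end{pmatrix}
\]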







so that

\[
\Gamma^*(x) \equiv \Gamma(x), \qquad \Delta^*(x) \equiv \Delta(x) \pmod{\mathfrak{p}^{k-k_0}}
\]

for all \(x \in \mathfrak{H}\). Setting \(M = QQ^*\) we obtain the desired result. One can of course
extend the preceding theorem, by induction, to the case of an arbitrary number
of constituents.

It is to be noted that the conditions \(k > k_0\) in Theorem 2 and \(k > 2k_0\) in
Theorem 3 are not necessarily the best possible, and it is possible that refinements
of these conditions could be found.


COROLLARY 1. If \(D\) is an ordinary \(\mathfrak{p}\)-adic integral representation of \(\mathfrak{H}\), then
for all complete reductions of \(D\), the irreducible constituents are unique up to order
and unimodular equivalence if and only if for all complete reductions of \(D\), modulo
\(\mathfrak{p}^k\), where \(k > 2k_0\), the irreducible constituents are unique up to order and unimodular
equivalence, modulo \(\mathfrak{p}^{k-k_0}\).


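Proof. (a) Assume that for all complete reductions of \(D\), the irreducible
constituents are unique up to order and unimodular equivalence. Let

\[
D \sim \begin{pmatrix} \Gamma_1 & & * \\ & \ddots & \\ & & \Gamma_r \end{pmatrix}, \qquad
D \sim \begin{pmatrix} \Gamma'_1 & & * \\ & \ddots & \\ & & \Gamma'_s \end{pmatrix} \pmod{\mathfrak{p}^k}
\]

be two complete reductions of \(D\), modulo \(\mathfrak{p}^k\). By Theorem 3, there exist two
complete reductions of \(D\) into ordinary irreducible constituents

\[
D \sim \begin{pmatrix} \Delta_1 & & * \\ & \ddots & \\ & & \Delta_r \end{pmatrix}, \qquad
D \sim \begin{pmatrix} \Delta'_1 & & * \\ & \ddots & \\ & & \Delta'_s \end{pmatrix}
\]

such that

\[
\begin{aligned}
\Gamma_i(x) &\equiv \Delta_i(x) \quad (i = 1, 2, \dots, r),\\
\Gamma'_j(x) &\equiv \Delta'_j(x) \quad (j = 1, 2, \dots, s)
\end{aligned}
\pmod{\mathfrak{p}^{k-k_0}},
\]

for all \(x \in \mathfrak{H}\). Since for ordinary \(\mathfrak{p}\)-adic integral representations, the number of
irreducible constituents in any complete reduction is invariant [2, pp. 359-360],
\(r = s\). Then from the assumption, there is some arrangement \(k_1, k_2, \dots, k_r\) of
the numbers \(1, 2, \dots, r\), such that

\[
\Delta_{k_i} \sim \Delta'_i \qquad (i = 1, 2, \dots, r).
\]

Therefore for this arrangement of the indices,

\[
\Gamma_{k_i} \sim \Gamma'_i \pmod{\mathfrak{p}^{k-k_0}} \qquad (i = 1, 2, \dots, r).
\]

(b) Assume that for all complete reductions of \(D\) (mod \(\mathfrak{p}^k\)) the irreducible
constituents are unique up to order and unimodular equivalence, modulo
\(\mathfrak{p}^{k-k_0}\). Let

\[
D \sim \begin{pmatrix} \Delta_1 & & * \\ & \ddots & \\ & & \Delta_r \end{pmatrix}, \qquad
D \sim \begin{pmatrix} \Delta'_1 & & * \\ & \ddots & \\ & & \Delta'_s \end{pmatrix}
\]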













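be two complete reductions of \(D\) into ordinary irreducible constituents (so that \(r = s\), as in (a)). Then by
Theorem 3,

\[
D \sim \begin{pmatrix} \Delta_1 & & * \\ & \ddots & \\ & & \Delta_r \end{pmatrix}, \qquad
D \sim \begin{pmatrix} \Delta'_1 & & * \\ & \ddots & \\ & & \Delta'_s \end{pmatrix} \pmod{\mathfrak{p}^k}
\]

must be complete reductions of \(D\) into irreducible constituents, modulo \(\mathfrak{p}^k\),
and from the assumption, there is some arrangement \(k_1, k_2, \dots, k_r\) of the
indices \(1, 2, \dots, r\), such that

\[
\Delta_{k_i} \sim \Delta'_i \pmod{\mathfrak{p}^{k-k_0}} \qquad (i = 1, 2, \dots, r).
\]

But since \(\Delta_{k_i}\) and \(\Delta'_i\) are ordinary \(\mathfrak{p}\)-adic integral representations, and \(k - k_0 > k_0\),
by Theorem 2,

\[
\Delta_{k_i} \sim \Delta'_i \qquad (i = 1, 2, \dots, r).
\]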


COROLLARY 2. If \(\mathfrak{p}\) does not divide \(N\), and if \(D\) is an ordinary \(\mathfrak{p}\)-adic integral
representation of \(\mathfrak{H}\), then for all complete reductions of \(D\), the irreducible constituents
are unique up to order and unimodular equivalence.

Proof. This is a direct consequence of Corollary 1, since in this case \(k_0 = 0\)
and we may take \(k = k - k_0 = 1\); representations modulo \(\mathfrak{p}\) being representations
over a field, the irreducible constituents in all complete reductions are
unique up to order and unimodular equivalence, modulo \(\mathfrak{p}\).


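THEOREM 4. If \(D\) is an ordinary \(\mathfrak{p}\)-adic integral representation of \(\mathfrak{H}\), and if

\[
D \sim D_0 = \begin{pmatrix} \Gamma & 0 \\ 0 & \Delta \end{pmatrix}
\]

is a unimodular decomposition of \(D\), modulo \(\mathfrak{p}^k\), where \(k > 2k_0\), then there is a
decomposition of \(D\) into ordinary \(\mathfrak{p}\)-adic integral components

\[
D \sim D' = \begin{pmatrix} \Gamma^* & 0 \\ 0 & \Delta^* \end{pmatrix}
\]

such that

\[
\Gamma^*(x) \equiv \Gamma(x), \qquad \Delta^*(x) \equiv \Delta(x) \pmod{\mathfrak{p}^{k-k_0}},
\]

for all \(x \in \mathfrak{H}\).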
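Proof. Let

\[
D_0 = \begin{pmatrix} \Gamma & \mathfrak{p}^k\Lambda \\ \mathfrak{p}^k\theta & \Delta \end{pmatrix}.
\]

Then, by Theorem 3, there is a reduction of \(D\) into ordinary \(\mathfrak{p}\)-adic integral
constituents

\[
D \sim \begin{pmatrix} \Gamma^* & \mathfrak{p}^k\Lambda \\ 0 & \Delta^* \end{pmatrix}
\]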







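such that

\[
\Gamma^*(x) \equiv \Gamma(x), \qquad \Delta^*(x) \equiv \Delta(x) \pmod{\mathfrak{p}^{k-k_0}}
\]

for all \(x \in \mathfrak{H}\). By an interchange of rows and columns, which amounts to a
unimodular transformation, one can always obtain

\[
D \sim \begin{pmatrix} \Delta^* & 0 \\ \mathfrak{p}^k\Lambda & \Gamma^* \end{pmatrix}.
\]

By a second application of Theorem 3, one obtains the desired result.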


COROLLARY. If \(D\) is an ordinary \(\mathfrak{p}\)-adic integral representation of \(\mathfrak{H}\), then for
all complete decompositions of \(D\), the indecomposable components are unique up to
order and unimodular equivalence if and only if for all complete decompositions of
\(D\), modulo \(\mathfrak{p}^k\), where \(k > 2k_0\), the indecomposable components are unique up to
order and unimodular equivalence, modulo \(\mathfrak{p}^{k-k_0}\).

The method of proof for this corollary is essentially the same as for Corollary
1 of Theorem 3.


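THEOREM 5. If \(\mathfrak{p}\) does not divide \(N\), then all \(\mathfrak{p}^k\)-modular representations of
\(\mathfrak{H}\) are fully reducible, for any \(k > 0\).

Proof. Let \(D\) be a \(\mathfrak{p}^k\)-modular representation of \(\mathfrak{H}\), and let

\[
D \sim D_0 = \begin{pmatrix} \Gamma & \Lambda \\ 0 & \Delta \end{pmatrix}
\]

be a reduction of \(D\), modulo \(\mathfrak{p}^k\). \(\Lambda\) is a modular binding system in \(\mathfrak{S}(\Gamma, \Delta, (\mathfrak{p}^k))\).
By Theorem 1, there is an \(n_1 \times n_2\) \(\mathfrak{o}\)-matrix \(T\) such that

\[
N\Lambda(x) \equiv \Gamma(x)T - T\Delta(x) \pmod{\mathfrak{p}^k}
\]

for all \(x \in \mathfrak{H}\). Since \(\mathfrak{p}\) does not divide \(N\), \(N\) is a \(\mathfrak{p}\)-adic unit, so that

\[
\Lambda(x) \equiv \Gamma(x)S - S\Delta(x) \pmod{\mathfrak{p}^k}
\]

for all \(x \in \mathfrak{H}\), where \(S = N^{-1}T\) is an \(n_1 \times n_2\) \(\mathfrak{o}\)-matrix, i.e., \(\Lambda\) is strongly
equivalent to zero, modulo \(\mathfrak{p}^k\). Transforming \(D_0\) by

\[
P = \begin{pmatrix} I_{n_1} & -S \\ 0 & I_{n_2} \end{pmatrix}
\]

one obtains

\[
P^{-1}D_0(x)P \equiv \begin{pmatrix} \Gamma(x) & 0 \\ 0 & \Delta(x) \end{pmatrix} \pmod{\mathfrak{p}^k}.
\]

This last theorem is evidently also true in the case of \(\mathfrak{p}\)-adic integral representations
of \(\mathfrak{H}\).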
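The splitting used in the proof of Theorem 5 can be tried out with ordinary integers reduced modulo a prime power. The sketch below is an assumption-laden illustration: it takes the rational prime \(p = 3\), the exponent \(k = 2\), and a hypothetical cyclic group of order 4 whose elements are block upper triangular integer matrices; the helper name `split_mod` is not from the text. It relies on Python's three-argument `pow` with exponent `-1` (available from Python 3.8) to invert \(N\) modulo \(p^k\).

```python
import numpy as np


def split_mod(elements, n1, p, k):
    """Transform each block upper triangular matrix so its binding block vanishes mod p**k."""
    mod = p ** k
    N = len(elements)
    n = elements[0].shape[0]
    identity = np.eye(n, dtype=int)

    def inverse(x):
        # Exact inverse, found inside the finite group itself.
        return next(y for y in elements if np.array_equal(x @ y, identity))

    lam = lambda x: x[:n1, n1:]     # binding system
    delta = lambda x: x[n1:, n1:]   # bottom constituent

    T = -sum(lam(z) @ delta(inverse(z)) for z in elements)
    S = (pow(N, -1, mod) * T) % mod  # S = N^{-1} T, possible because p does not divide N

    P = identity.copy()
    P[:n1, n1:] = -S                 # transforming matrix with upper right block -S
    P_inv = identity.copy()
    P_inv[:n1, n1:] = S
    return [(P_inv @ x @ P) % mod for x in elements]


# Hypothetical example: the cyclic group of order 4 generated by `a`.
a = np.array([[1, 1, 0], [0, 0, -1], [0, 1, 0]], dtype=int)
elements = [np.linalg.matrix_power(a, j) for j in range(4)]
reduced = split_mod(elements, n1=1, p=3, k=2)
print(all(not m[:1, 1:].any() for m in reduced))
```

The printed value reports whether every transformed matrix has a vanishing upper right-hand block modulo 9, that is, whether the reduction has become a decomposition in this example.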

In conclusion, we shall give a counter-example to show that in the case that
\(\mathfrak{p}\) divides \(N\), the irreducible constituents in all complete reductions of a \(\mathfrak{p}\)-adic
integral representation of \(\mathfrak{H}\) are not necessarily unique up to order and unimodular
equivalence. This counter-example is the same as that considered by
Diederichsen [2, pp. 373-374] in the case of an arbitrary principal ideal ring.

Consider the group of all symmetries of the square, whose generators obey
the following conditions:

\[
a^4 = b^2 = (ab)^2 = 1
\]


and 


are irreducible and rationally equivalent. The module of all integral matrices
which commute with these two representations has only one generator, namely

\[
T = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \qquad (|T| = 2),
\]

and it is easily seen that any matrix commuting with these representations,
modulo any power of 2, say \(2^k\), must be congruent to a multiple of this generator,
modulo \(2^k\), and consequently must have a determinant which is divisible by 2.
Therefore, modulo any power of 2, these two representations are not unimodularly
equivalent.
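The divisibility claim can be explored by brute force for small powers of 2. The sketch below assumes, purely for illustration, one concrete pair of integral representations of degree 2 of the group of the square (the dictionaries `GAMMA` and `DELTA`); they satisfy \(a^4 = b^2 = (ab)^2 = 1\) and are intertwined by a matrix of determinant 2, but they are an assumption and need not coincide with the matrices intended in the text.

```python
from itertools import product

import numpy as np

# Assumed pair of representations (hypothetical choice for this illustration).
GAMMA = {"a": np.array([[0, 1], [-1, 0]]), "b": np.array([[1, 0], [0, -1]])}
DELTA = {"a": np.array([[0, 1], [-1, 0]]), "b": np.array([[0, 1], [1, 0]])}


def intertwiners(mod):
    """All 2 x 2 matrices T over Z/mod with GAMMA(g) T = T DELTA(g) (mod `mod`) for g = a, b."""
    found = []
    for entries in product(range(mod), repeat=4):
        T = np.array(entries).reshape(2, 2)
        if all(((GAMMA[g] @ T - T @ DELTA[g]) % mod == 0).all() for g in "ab"):
            found.append(T)
    return found


def det2(T):
    """Determinant of a 2 x 2 integer matrix, computed exactly."""
    return int(T[0, 0] * T[1, 1] - T[0, 1] * T[1, 0])


for k in (1, 2, 3):
    sols = intertwiners(2 ** k)
    print(k, len(sols), all(det2(T) % 2 == 0 for T in sols))
```

For this assumed pair every intertwining matrix modulo \(2^k\) found by the search has even determinant, so none of them is unimodular, in line with the argument above.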

Now consider the two following representations 


| | ad 1}! 
Di = hing -|- ,b— - = — \ 
| | =i Yon 
1] 1] | | 
and { —] —|] 1 \ | 
— Boil l "= 1/1 = 
| | ' 


These two representations are unimodularly equivalent, for \(UD_1 = D_2U\), where


= 
2 l i —] 
—|] 2 l | 
2 -3|;-1 -2 
3 3 2 -! 
and none of their irreducible constituents are unimodularly equivalent, modulo 
any power of 2, by the above discussion. 







REFERENCES 


1. R. Brauer and C. Nesbitt, On the modular representations of groups of finite order, University
of Toronto Studies, Math. Series, no. 4 (1937).

2. F. E. Diederichsen, Über die Ausreduktion ganzzahliger Gruppendarstellungen bei arithmetischer
Äquivalenz, Abh. Math. Sem. Hansischen Univ., 14 (1938), 357-412.

3. B. L. van der Waerden, Moderne Algebra, 2nd ed. (New York, 1943).

4. H. Zassenhaus, Neuer Beweis der Endlichkeit der Klassenzahl bei unimodularer Äquivalenz
endlicher ganzzahliger Substitutionsgruppen, Abh. Math. Sem. Hansischen Univ., 12
(1938), 276-288.


Université de Montréal 








NOTE ON THE MODULAR REPRESENTATIONS 
OF SYMMETRIC GROUPS 


HIROSI NAGAO 


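1. Let \(p\) be a fixed prime number. We denote by \(k(n)\) the number of partitions
of \(n\) and set

\[
l(\lambda) = \sum_{\lambda_1 + \dots + \lambda_p = \lambda} k(\lambda_1)k(\lambda_2)\cdots k(\lambda_p) \qquad (0 \leq \lambda_i \leq \lambda), \tag{1}
\]

\[
l^*(\lambda) = \sum_{\lambda_1 + \dots + \lambda_{p-1} = \lambda} k(\lambda_1)k(\lambda_2)\cdots k(\lambda_{p-1}) \qquad (0 \leq \lambda_i \leq \lambda). \tag{2}
\]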


Recently it was shown by Nakayama and Osima [8] and Robinson [12]
that the number of ordinary irreducible representations belonging to a p-block
of weight \(\beta\) is equal to \(l(\beta)\). For the number of modular irreducible representations
Robinson [12] showed that it is independent of the p-core, and using this
result, Osima [10] proved that it is actually equal to \(l^*(\beta)\). In this note we shall
give a direct computation of this number.
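The numbers \(l(\beta)\) and \(l^*(\beta)\) are easy to tabulate, since they count \(p\)-tuples and \((p-1)\)-tuples of partitions of total size \(\beta\). The sketch below is a small illustration; the function names `partition_numbers`, `tuple_count`, `l` and `l_star` are assumptions introduced here, and the identity checked at the end follows directly from the definitions (1) and (2) by splitting off the last summand.

```python
def partition_numbers(limit):
    """Return [k(0), k(1), ..., k(limit)], the partition numbers."""
    counts = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            counts[total] += counts[total - part]
    return counts


def tuple_count(weight, parts, k):
    """Number of `parts`-tuples of partitions whose sizes add up to `weight`."""
    coeffs = [1] + [0] * weight
    for _ in range(parts):
        coeffs = [
            sum(coeffs[i] * k[t - i] for i in range(t + 1))
            for t in range(weight + 1)
        ]
    return coeffs[weight]


K = partition_numbers(20)


def l(beta, p):
    return tuple_count(beta, p, K)


def l_star(beta, p):
    return tuple_count(beta, p - 1, K)


# Example: weight 2 and p = 3.
print(l(2, 3), l_star(2, 3))

# l(beta) - l*(beta) equals the sum over lam = 1..beta of l*(beta - lam) * k(lam).
for p in (2, 3, 5):
    for beta in range(0, 9):
        rhs = sum(l_star(beta - lam, p) * K[lam] for lam in range(1, beta + 1))
        assert l(beta, p) - l_star(beta, p) == rhs
```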

Now we mention, without proof, some theorems necessary for the computation.
For Young's diagrams \([\alpha]\) and \([\alpha']\) of \(S_n\) and \(S_{n'}\) (\(n' < n\)), we set
\(r(\alpha, \alpha') = (-1)^r\) if \([\alpha]\) contains an \((n - n')\)-hook of leg length \(r\) such that
\([\alpha']\) can be obtained from \([\alpha]\) by removing it; otherwise we set \(r(\alpha, \alpha') = 0\).

Denote by \(\chi(\alpha; G)\) the ordinary irreducible character of \(S_n\) corresponding to
\([\alpha]\); then the Murnaghan-Nakayama recurrence rule [7, p. 182; 6; 15] is as follows:

If \(G\) is an element of \(S_n\) containing a \(g\)-cycle \(P\) and \(\bar{G}\) is the permutation of
\(n - g\) letters arising from \(G\) by removing this cycle, then

\[
\chi(\alpha; G) = \sum_{[\alpha']} r(\alpha, \alpha')\,\chi(\alpha'; \bar{G}) \tag{3}
\]

where \([\alpha']\) runs over all Young's diagrams of \(S_{n-g}\).
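The recurrence (3) translates directly into a short recursive procedure. The sketch below is one possible implementation, not the author's: it encodes a diagram by its first-column hook lengths (often called beta-numbers), so that removing a \(g\)-hook amounts to lowering one of these numbers by \(g\), the leg length being the number of other beta-numbers jumped over. The function names are assumptions.

```python
def beta_numbers(partition):
    """First-column hook lengths of a partition given as weakly decreasing positive parts."""
    r = len(partition)
    return frozenset(part + (r - 1 - i) for i, part in enumerate(partition))


def character(partition, cycle_type):
    """Value of the irreducible character labelled by `partition` on the class `cycle_type`."""
    if sum(partition) != sum(cycle_type):
        raise ValueError("partition and cycle type must have the same size")

    def recurse(beta, cycles):
        if not cycles:
            return 1  # all nodes removed: character of the trivial group
        g, rest = cycles[0], cycles[1:]
        total = 0
        for b in beta:
            target = b - g
            if target < 0 or target in beta:
                continue  # no removable g-hook corresponds to this beta-number
            leg = sum(1 for c in beta if target < c < b)
            total += (-1) ** leg * recurse((beta - {b}) | {target}, rest)
        return total

    return recurse(beta_numbers(partition), tuple(cycle_type))


# Character table entries of the symmetric group on 3 letters for the diagram [2, 1].
print(character((2, 1), (1, 1, 1)), character((2, 1), (2, 1)), character((2, 1), (3,)))
```

The order in which the cycles are removed does not affect the result, so the procedure simply strips them in the order given.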


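Now let \([\alpha^0]\) be a p-core¹ with \(m\) nodes and \(n = m + \beta p\). Then the number
of Young's diagrams of \(S_{m+\lambda p}\) with p-core \([\alpha^0]\) is equal [8; 9; 12; 13] to \(l(\lambda)\).
We denote these diagrams by

\[
[\alpha_1^{(\lambda)}], \dots, [\alpha_{l(\lambda)}^{(\lambda)}].
\]

In case \(\lambda \geq \mu > 0\) we set

\[
R_j^{(\lambda,\mu)} = \bigl(r(\alpha_1^{(\lambda)}, \alpha_j^{(\lambda-\mu)}), \dots, r(\alpha_{l(\lambda)}^{(\lambda)}, \alpha_j^{(\lambda-\mu)})\bigr) \tag{4}
\]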


Received December 12, 1952.
¹For the notion of p-cores, see [7], and for the relation between p-cores and p-blocks, see
[7] and [3].







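and

\[
R_j^{(\lambda,\mu)}(G) = \sum_{i=1}^{l(\lambda)} r(\alpha_i^{(\lambda)}, \alpha_j^{(\lambda-\mu)})\,\chi(\alpha_i^{(\lambda)}; G) \qquad (j = 1, 2, \dots, l(\lambda - \mu)). \tag{5}
\]

Then we have [10, §1]

THEOREM.

\[
R_j^{(\lambda,\mu)}(G) =
\begin{cases}
0 & \text{when } G \text{ contains no } \mu p\text{-cycle},\\[4pt]
\dfrac{n(G)}{n(\bar{G})}\,\chi(\alpha_j^{(\lambda-\mu)}; \bar{G}) & \text{when } G \text{ contains a } \mu p\text{-cycle},
\end{cases}
\]

where \(\bar{G}\) is the permutation arising from \(G\) by removing this cycle and \(n(G)\), \(n(\bar{G})\)
are the orders of the normalizers of \(G\), \(\bar{G}\) in \(S_{m+\lambda p}\), \(S_{m+(\lambda-\mu)p}\) respectively.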


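2. First we shall remark that the following propositions are mutually equivalent:

(I) The number of modular irreducible representations belonging to the block of
weight \(\beta\) with p-core \([\alpha^0]\) is equal to \(l^*(\beta)\).

(II) The rank of the vector module generated by

\[
R_j^{(\beta,\lambda)} \qquad (\lambda = 1, \dots, \beta;\ j = 1, \dots, l(\beta - \lambda))
\]

is equal to \(l(\beta) - l^*(\beta)\).

(III) The rank of the module consisting of all solutions of the equation

\[
\sum_{\lambda=1}^{\beta} \sum_{j=1}^{l(\beta-\lambda)} x_j^{(\lambda)} R_j^{(\beta,\lambda)} = 0 \tag{6}
\]

is equal to

\[
\sum_{\lambda=1}^{\beta} l(\beta - \lambda) - (l(\beta) - l^*(\beta)).
\]

(III') The rank of the module consisting of all solutions of the equation

\[
\sum_{\lambda=1}^{\beta} \sum_{j=1}^{l(\beta-\lambda)} x_j^{(\lambda)} R_j^{(\beta,\lambda)}(G) = 0, \qquad G \in S_n, \tag{7}
\]

is equal to

\[
\sum_{\lambda=1}^{\beta} l(\beta - \lambda) - (l(\beta) - l^*(\beta)).
\]

By Chung [4], Osima [9], and Littlewood [5] it was shown that

\[
R_j^{(\beta,\lambda)} \qquad (\lambda = 1, \dots, \beta;\ j = 1, \dots, l(\beta - \lambda))
\]

generate the module consisting of all vectors which are orthogonal with every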













column of the matrix of decomposition numbers corresponding to the p-block,
namely, the module consisting of all solutions of the following equations:

\[
\sum_{i=1}^{l(\beta)} x_i\,\chi(\alpha_i^{(\beta)}; V) = 0 \tag{8}
\]

for every p-regular element \(V\) of \(S_{m+\beta p}\).

Since the columns of the matrix of decomposition numbers are linearly
independent, (I) and (II) are equivalent. The equivalence among (II), (III),
and (III') is almost evident.


In the following we shall prove the proposition (II1I’). Let 
(x 
j 


be a solution of (7). If G = P,V in (7) with a Ap-cycle P, and a p-regular 
permutation V of the m — Xp letters not contained in P,, then since A ¥ u 
implies 

RY?” (G) =@ 
we obtain 


i(B—A) 


(8) Dx; RF” (P, V) = 0, 


j=l 


and hence, from the theorem in §1, we have 


(9) > x” x(a?; V) = 0 


ra | 
for all p-regular elements of S,_,). 


If \ = 8 then /(0) = 1 and x™ = 0, and if A < 8 then, from the result of 
Chung [4], Osima [9], and Littlewood [5] mentioned above, it turns out that 


is a linear combination of 
er (u=1,..., B—A;k=1,..., (6 — » —p)). 
Set 
B—Av 1(8—A—sp) . 
*) gp G— 
(10) Gye. > =”. 
pol kel 
Next suppose that A; + Az < 8 and set G = Py Py, V in (7) where no two of 


P,,, Py, and V have common letters. If A; # X2 then 


0 = ; xo) RG) 4 > xo) Ro” (G) 
4 7 





_ —"(G) Or) (BA), r n(G) Oe). 7 B—de), , 
= n(Py, We * x(a; rt. J ) + Py. aes x (a; :P>. V) 







—_ _n(G) = (Ax+@) p(B-A,.#) , 
-v n(P,, Ary dx Ri (P,, ] ) 


n(G) ,'») Re As.” 
T SCP, We dx (Pr, 


“_ ——>;° (xf a 4 xi tAy V0 (a \. Aim—As), -V) = 0. 
z 
Hence, x+***) 4+ xis) = 0 if A, + A» = B, and if A; + A» < B then 


(ast As) (Agid,) 
(x + x” de 


is a linear combination of 


ere g0e1...., B— 1 — Aesj = 1,..., 08 — 1 — As — )). 
We set 
(11) ira, + =), - pa Ym gine se Asa) 
k 

When A; = Ao, by similar arguments as above, we have x®''*») = 0 if 2A, = 8, 
and 
(12) ass ae eS (2d; < 8). 

- & 


Repeating the similar arguments, we have a set of coefficients 


 jemelenmanss ( Omt+...¢4<8:f=1,..., i(6 - sa.) 
1 


which are independent of the order of A;,...,A,-1, and the relations among 
these coefficients: 


(13) (3 2 - a) =0 B= > rw 


(14) 
B—Ai—...—Ae UB—A, Rees) 
Ag. LY (Ay ‘ *. : —Asiheas 
020 ocean) RoMRD NRG SAMMI ST aoe 
| 


Ace y= l k=l 
t 
>A < B, 
l 


A — o8 

where {A,...A;.-..A,} denotes the set of \’s arising from {A,. . . A;} by remov- 

ing Ay, and >,’ indicates the summation over all different {A;...A,.. . Ag}. 
Conversely, it is easily seen from the above arguments that if 


{ () (riads), } 
tXy ; Xe peer 


satisfies the relations (13) and (14), x9 is a solution of (7). 
The propositions (I)-(III') are true for \(\beta = 0\). We shall now assume that the












propositions have already been shown for all numbers less than \(\beta\) and prove
those for \(\beta\) by induction. Then the rank of the module generated by


ger” (Qu=1,...,8—A;k=1,...,UB—A—-4#)) 


is equal to /(6 — A) — /*(8 — dA). Fix a basis for each A (1 < A < 8) that 
contains R®-*), and set 


(15) ge etemette - 0 
when 
(G—A,- —Ae—:.Ae) 
k 


is not contained in the fixed basis. Then the systems of coefficients which 
satisfy the relations (13), (14), and (15) form a module isomorphic to the 
module of the solutions of (7), and hence it is sufficient to prove that the rank 
of this module is equal to 


8 
> (6 — r) — (8) — *(8)). 


A=1 
LemMA. Define the linear forms f, g, and h in x®---**” as follows: 
(i) fordXi + ...+A, = B we set 


f°~-*"@) - > ’x a, i, Agids) 
(ii) for Ai +... +A, < B we set 


a 
a = an , » ) x" hte» Ay.. —wed. on ent) 
t “ k 
(iii) when 1 CA +... Apa < B— land 
Re? “Ah, Ae—a. As) 
teed 


ts not contained in a fixed basis we set 


(ra..-Aa—ai de) (Ay... Ramat de) 
gj (x) = x; : 


(iv) for R®™ which is not contained in a fixed basis, we set 
(A) ™ 
hy (x) ™= Xe. 


Then the linear independence of f and g, and that of f, g, and h are both equivalent
to the propositions (I)-(III'), under the assumption that the propositions (I)-(III')
are true for all numbers less than \(\beta\).


Proof. We denote by A, B, C, and D the numbers of 


DiereRgaatc) Oic-.Red Gic..Remstde) 2d 
x; Ji » 23 » My 


respectively, and first compute A, B, and C. 





(a) Computation of A. The number of 
nea, 
with A, + ... + A, = A is equal to 
j< 
1(B — d) ye RA —u)¢, 
and hence 
8 a 
(16) A= > 2, 1B — r)k(A — 1). 
ma oe 
(b) Computation of B. The number of 
fo ~sAe) 
Jj 
with A, +... + A, = A is equal to /(8 — A)R(A), and hence 
8 
B= > k(A)I(B — 2). 
C= 
(c) Computation of C. The number of 
(Ay... Ama Ae) 
gi 
with Ai +... +Aga =A (1 <A < B — 1) is equal to 
pd \ 
ko} 8 l(pe-rX—yw)-Up-—-aAy+ (ps - A)f, 
pol 
and hence 
g—1 B—r s~1 p—1 
(17) C= > (AB — A — uw) — DRAB — A) + Y RAB — 2) 
A=ml Bool hl h=1 


8 ir-1 s-1 B—1 
= > > ka — w)l(B— dA) — YS R(A(B— A) + DY RAP — 2). 
Awl hel 


From these computations we have 


(18) A — (B+C) 


fe) 
1(8 — 1)k(0) + >> (8 — AYR) — 1(0)k(B) 


b—1 
— >} P(e — rdR(A) 


heel 


ll 


8 
> WB — ») — (l(8) — *(B)). 
A=1 


It is easily seen that

\[
l(\beta) - l^*(\beta) = \sum_{\lambda=1}^{\beta} l^*(\beta - \lambda)\,k(\lambda).
\]













Assume now the linear independence of the linear forms f and g. Since the
rank of the module consisting of all solutions of the system of linear equations


(19) ——— = 0, gs" ~sAg—a Ae) (x) = 0 


coincides with that of the module of solutions of (7), and since f and g are
linearly independent, the rank is equal to A - (B + C); this proves the
proposition (III') from (18).

Conversely, suppose that the propositions (I)-(III’) are true. Then from the 
proposition (II), D is equal to 


8 
X16 — r) — (8) — P*(8)). 


It is easily seen that the system of linear equations (19) and h? (x) = 0 has 
only the trivial solution. Since A — (B + C + D) = 0, f, g, and h are linearly 
independent. 

By the lemma and the hypothesis of induction, f, g, and h for weight less
than \(\beta\) are linearly independent. We now prove the linear independence of f and
g for weight \(\beta\).

Let 


(20) > * eeeaae : ee **) (x) + Z. hs i etn | -_ 0 


be a linear relation among f and g. The coefficient of x? in the left-hand side is 
equal to a® and hence a9 = 0. Take a \ (1 < A < B — 1) and set 


ae om oe a5 Be) er = 0 
when {A; ...A,-1} does not contain A. Then 
Diss...) Nits. --Be—a ime) 
oo a: a ee (¢ > 1) 


are transferred to the linear forms, f, g, and h in 


in the case of weight 6 — A, and (20) becomes a linear relation among these 
linear forms. Thus it follows that 


oe Be) =n 0, oo emai me) - 0 


for any \, and 
fpr (x), ’ Ae ernie 


are linearly independent. This proves the propositions (I)-(III') by the lemma.


REFERENCES 
1. R. Brauer and C. Nesbitt, On the modular representations of groups of finite order, University 
of Toronto Studies, Math. Ser., 4 (1937). 
2. R. Brauer and C. Nesbitt, On the modular characters of groups, Ann. Math., 42 (1941), 556-590.


3. R. Brauer and G. de B. Robinson, On a conjecture by Nakayama, Trans. Royal Soc. 
Canada, Sec. III, 40 (1947), 11-25. 




4. J. H. Chung, Modular representations of the symmetric group, Can. J. Math., 3 (1951),
309-327.
5. D. E. Littlewood, Modular representations of symmetric groups, Proc. Royal Soc. London
(A), 209 (1951), 333-352.
6. F. D. Murnaghan, On the representations of the symmetric group, Amer. J. Math., 59
(1937), 437-488.
7. T. Nakayama, Some modular properties of irreducible representations of a symmetric group
I, II, Jap. J. Math., 17 (1948), 165-184, 277-294.
8. T. Nakayama and M. Osima, Note on blocks of symmetric group, Nagoya Math. J., 2
(1951), 111-117.
9. M. Osima, On some character relations of symmetric groups, Okayama Math. J., 1 (1952),
63-68.
10. M. Osima, Some remarks on the characters of the symmetric groups, Can. J. Math., 5 (1953).
11. G. de B. Robinson, On the modular representations of the symmetric group, Proc. Nat.
Acad. Sci., 38 (1952).
12. G. de B. Robinson, On a conjecture by J. H. Chung, Can. J. Math., 4 (1952), 373-380.
13. R. A. Staal, Star diagrams and the symmetric group, Can. J. Math., 2 (1950), 79-92.
14. R. M. Thrall and G. de B. Robinson, Supplement to a paper by G. de B. Robinson, Amer. J.
Math., 73 (1951), 721-724.
15. H. Weyl, The classical groups (Princeton, 1939).




Osaka University











UNITARY GROUPS GENERATED BY REFLECTIONS 
G. C. SHEPHARD 


1. Introduction. A reflection in Euclidean n-dimensional space is a
particular type of congruent transformation which is of period two and leaves a
prime (i.e., hyperplane) invariant. Groups generated by a number of these
reflections have been extensively studied [5, pp. 187-212]. They are of interest 
since, with very few exceptions, the symmetry groups of uniform polytopes are of 
this type. Coxeter has also shown [4] that it is possible, by Wythoff's construction, 
to derive a number of uniform polytopes from any group generated by reflections. 
His discussion of this construction is elegantly illustrated by the use of a graphi- 
cal notation [4, p. 328; 5, p. 84] whereby the properties of the polytopes can 
be read off from a simple graph of nodes, branches, and rings. 

The idea of a reflection may be generalized to unitary space \(U_n\) [11, p. 82];
a p-fold reflection is a unitary transformation of finite period p which leaves a
prime invariant (2.1). The object of this paper is to discuss a particular type of
unitary group, denoted here by \([p\ q;\ r]^m\), generated by these reflections. These
groups are generalizations of the real groups with fundamental regions \(B_n\),
\(E_6\), \(E_7\), \(E_8\), \(T_7\), \(T_8\), \(T_9\) [5, pp. 195, 297]. Associated with each group are a number
of complex polytopes, some of which are described in §6. In order to facilitate
the discussion and to emphasise the analogy with the real groups, a graphical
notation is employed (§3) which reduces to the Coxeter graph if the group is
real.

Every polytope \(\Pi_n\), whether real or complex, is associated with a configuration
in projective space \(P_{n-1}\), which may be derived by taking the centre of the
polytope as origin and then interpreting the coordinates of the vertices as
homogeneous coordinates in \(P_{n-1}\). (The collineation group associated with the
configuration is the group that corresponds to the symmetry group of the polytope
[11, p. 84].) Many well-known and interesting configurations are associated
with the polytopes whose symmetry groups are of the type \([p\ q;\ r]^m\) such as the
configuration of 126 points in five dimensions recently investigated by Todd
[13; 14], Hamill [8], and Hartley [9]. We shall refer to this as the Mitchell-Hamill
configuration.

Some of the polytopes discussed are degenerate, that is, analogous to the
honeycombs of Euclidean space [5, p. 127]. One of these is of particular interest
since its vertices are the points of a lattice associated with the extreme duodenary
form \(K_{12}\) [7].

Throughout the paper the definitions and notation of the author's Regular
Complex Polytopes [11] are assumed, but a table of notations is added for reference.


Received December 28, 1951; in revised form January 19, 1953 




Symbol | Meaning 

E, | Euclidean space of m dimensions. 

U, | Unitary space of m dimensions [11, p. 82]. 

P,, | Projective space of dimensions. 

Il, Any polytope in E, or U,. 

@® (1,) | The symmetry group of II, [11, p. 84]. 

@,(0,) or G(T) | The group of (orthogonal or unitary) symmetry matrices 
| of II,. 

+ | A polytope whose vertex figure is II, [11, pp. 85, 87]. 

1,*” A polytope whose vertex figure is II,*°-”. 

Gey Buin Ya | The real regular polytopes of £, (see §6 and [2, p. 344]) 

A. v0" | The generalized cross-polytope and orthotope (§6). 

tI, The rth truncation of II, (§6). 

Pil(qi)P2. - -(Gr—1) Pn | The extended Schlafli symbol for a regular polytope 
| [11, p. 88}. 

lp air]™, (Pi qir)™ | See $4. 

(37. « *] See [5, p. 200]. 

lari, Wo} See 2.3. 


I must express my indebtedness to J. A. Todd and H. S. M. Coxeter for their
advice and suggestions in carrying out the investigations described in this paper.
I am especially grateful to the former for undertaking the formidable task of
checking the abstract definitions in 4.12.


2. Reflections. Every real non-degenerate uniform polytope \(\Pi_n\) in \(E_n\)
has a symmetry group \(\mathfrak{G}(\Pi_n)\) which is generated by at most n elements. In
@,(Il,), the group of orthogonal symmetry matrices, it is generally possible to 
choose these generators as reflection matrices, that is, matrices whose characteristic
roots are 1 (repeated \(n - 1\) times) and \(-1\). By reducing to diagonal
form in the usual manner, a reflection matrix may be written \(S'AS\) where \(S\)
is orthogonal and \(A\) is the matrix \(\operatorname{diag}(1^{n-1}, -1)\).

The choice of the generators as reflections is not possible in a few anomalous 
cases, of which the most familiar are the two “‘snub’”’ polyhedra [4, pp. 336-337]. 

The idea of a reflection can be extended to $U_n$. A $p$-fold reflection matrix is
defined as a matrix of period $p$ which leaves a prime of $U_n$ invariant.


2.1. A $p$-fold reflection matrix is a unitary matrix with characteristic roots 1
(repeated $n - 1$ times) and $\theta$, a primitive $p$th root of unity.


It may be written in the form $\bar{S}'\Lambda S$ where $S$ is unitary and $\Lambda$ is the matrix
$\mathrm{diag}(1^{n-1}, \theta)$. If the invariant prime has the equation

$$\sum_{i=1}^{n} a_i x_i = a'x = 0,$$

then the equation of the $p$-fold reflection is

2.2  $$x^* = (I - ba')x$$

where $b$ is chosen so as to make the transformation unitary, and

$$b'a = 1 - \exp(2\pi i/p).$$
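As a numerical illustration of 2.2, the following minimal sketch builds the matrix $I - ba'$ for a given prime $a'x = 0$. The choice $b = (1 - \theta)\bar{a}/(a'\bar{a})$ is an assumption made here (it satisfies $b'a = 1 - \theta$ and makes the map unitary); the function names are hypothetical and do not come from the paper.

```python
import cmath

def pfold_reflection(a, p):
    """Matrix I - b a' of 2.2 for the prime a'x = 0, returned as a list of rows."""
    n = len(a)
    theta = cmath.exp(2j * cmath.pi / p)
    norm = sum(abs(ai) ** 2 for ai in a)  # a' * conj(a)
    # Assumed choice of b: a multiple of conj(a), so that b'a = 1 - theta.
    b = [(1 - theta) * complex(ai).conjugate() / norm for ai in a]
    return [[(1 if i == j else 0) - b[i] * a[j] for j in range(n)]
            for i in range(n)]

def apply(matrix, x):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in matrix]

def close(u, v, tol=1e-9):
    """Componentwise comparison of two complex vectors."""
    return all(abs(ui - vi) < tol for ui, vi in zip(u, v))
```

With `M = pfold_reflection([1, -1, 0], 3)`, points of the prime $x_1 - x_2 = 0$ are left fixed, the vector $(1, -1, 0)$ is multiplied by $\theta$, and applying `M` three times returns any vector to itself.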
Regular complex polygons [11, pp. 89-93] have symmetry groups generated
by two reflections of this type; the polygon whose extended Schläfli symbol is
$p_1(q_1)p_2$ [11, p. 88] has a symmetry group generated by two elements $S$, $T$
corresponding to the matrices:

$S$, which is a $p_1$-fold reflection permuting the vertices on an edge of the polygon
cyclically, and

$T$, which is a $p_2$-fold reflection permuting the vertices of a vertex figure of the
polygon cyclically [cf. 11, p. 90].

It will be readily verified that all the symmetry groups of complex regular
polytopes in $U_n$ may be generated in a similar manner by $n$ $p$-fold reflections.

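Suppose that

$$\mathfrak{p}_1 \equiv \sum_{i=1}^{n} a_i x_i = 0, \qquad \mathfrak{p}_2 \equiv \sum_{i=1}^{n} b_i x_i = 0$$

are two primes of $U_n$. Then we define

2.3  $$\{\mathfrak{p}_1, \mathfrak{p}_2\} = \Bigl|\sum_{i=1}^{n} a_i \bar{b}_i\Bigr|.$$

This is an invariant under unitary transformations. If $\{\mathfrak{p}_1, \mathfrak{p}_1\} = 1$, we say that
$\mathfrak{p}_1$ is normalized, the normalization being unaffected by multiplying the equation
of $\mathfrak{p}_1$ by any complex number of unit modulus. For two real normalized primes,
$\{\mathfrak{p}_1, \mathfrak{p}_2\}$ is the cosine of the angle between the primes.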


2.4 In $U_n$, the group generated by 2-fold reflections in two normalized primes
$\mathfrak{p}_1$, $\mathfrak{p}_2$ is of finite order if and only if $\{\mathfrak{p}_1, \mathfrak{p}_2\}$ is the cosine of a rational angle, that
is, a rational multiple of $\pi$. Further, if $\{\mathfrak{p}_1, \mathfrak{p}_2\} = \cos(\pi h/k)$ where $h$ and $k$ have no
common divisor, then the order of the group is $2k$.


If the primes are real, the result is familiar, for reflections in two primes
inclined at an angle $h\pi/k$ generate a group of order $2k$. The proof of the result
depends upon showing that the statement can be reduced to that of a property
of real primes in $E_n$, by suitable choice of coordinate system.

Choose the coordinate system so that $\mathfrak{p}_1$ is $x_1 = 0$, and $\mathfrak{p}_2$ is $b_1 x_1 + b_2 x_2 = 0$.
(To do this we have only to ensure that the intersection $\mathfrak{p}_1 \cdot \mathfrak{p}_2$ is $x_1 = x_2 = 0$.)
Since the equation of $\mathfrak{p}_2$ may be multiplied by any complex number of unit
modulus without altering the normalization, we do this in such a manner
that $b_1$ is real. If we now change the coordinate system by writing

$$x_i' = x_i \quad (i = 1, 3, 4, \ldots, n), \qquad x_2' = (b_2/|b_2|)\,x_2,$$

then the equations of $\mathfrak{p}_1$ and $\mathfrak{p}_2$ are both real, the matrices 2.2 are real, and the
theorem follows from the result for $E_n$.
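The real case to which 2.4 reduces can be checked by brute force. The sketch below (hypothetical helper names, floating-point arithmetic with rounding as an assumed way of comparing matrices) closes the set generated by two plane reflections whose mirrors are inclined at $\pi h/k$ and counts the elements.

```python
import math

def reflection(phi):
    """Real 2x2 reflection in the line through the origin with unit normal (cos phi, sin phi)."""
    c, s = math.cos(phi), math.sin(phi)
    return ((1 - 2 * c * c, -2 * c * s), (-2 * c * s, 1 - 2 * s * s))

def mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][t] * b[t][j] for t in range(2)) for j in range(2))
                 for i in range(2))

def key(m):
    """Rounded entries, used to recognise equal matrices despite rounding error."""
    return tuple(round(v, 9) + 0.0 for row in m for v in row)

def group_order(h, k, limit=10000):
    """Order of the group generated by reflections in two lines at angle pi*h/k."""
    gens = [reflection(0.0), reflection(math.pi * h / k)]
    identity = ((1.0, 0.0), (0.0, 1.0))
    seen = {key(identity)}
    frontier = [identity]
    while frontier:
        new = []
        for m in frontier:
            for g in gens:
                prod = mul(g, m)
                if key(prod) not in seen:
                    seen.add(key(prod))
                    new.append(prod)
        if len(seen) > limit:
            raise ValueError("group appears to be infinite or too large")
        frontier = new
    return len(seen)
```

For coprime $h$ and $k$ the count agrees with the order $2k$ stated in 2.4, for instance `group_order(2, 5)` gives 10.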


In order to discuss the group generated by a set of reflections it is convenient 
to introduce a notation for a set of reflecting primes which characterizes their 
geometrical relationships, but is independent of the coordinate system. The 
notation is the graph defined in §3. 


3. Graphs. An elegant graphical notation for groups generated by
reflections in $E_n$ and the associated polytopes was invented by Coxeter [5, p. 84].
Briefly, the graph for a group consists of a number of nodes and branches (called
dots and links in [4]) constructed according to the rules:


3.1 Each reflecting prime is symbolized by a node of the graph. 


3.2 If the angle between two primes is $\pi/k$ then the corresponding nodes are
joined by a branch if $k > 2$, and the branch is labelled “$k$” if $k \ge 4$.


Conventionally branches are not numbered 3 since this type occurs most 
frequently. 


Thus the graph for a finite group in $E_n$ has $n$ nodes. It may be disconnected,
that is to say, consist of two or more separate parts which have no
interconnecting branches, and then the corresponding group is the direct product of the
groups represented by each part.

As an example of the graphical notation, the symmetry group of the cube
is generated by reflections in three planes inclined at angles $\tfrac12\pi$, $\tfrac13\pi$, and $\tfrac14\pi$ to
each other. It is denoted graphically by

3.3    o--4--o-----o

So that the graph corresponding to a given group is uniquely defined, we 
specify that the reflecting primes must bound a fundamental region of the 
group. 

Suppose now that a unitary group is generated by reflections in a number of
primes. More precisely suppose that each generator is a $p_i$-fold reflection in
$\mathfrak{p}_i$ ($i = 1, 2, \ldots, N$). If these primes are concurrent it is convenient to take the
point of concurrency as the origin of the coordinate system. In any case, the
graph is constructed according to the following rules:


3.4 Each of the reflecting primes $\mathfrak{p}_i$ is symbolized by a node $P_i$ of the graph,
and this node is labelled “$p_i$” if $p_i > 2$.


3.5 Each pair of nodes $P_i P_j$ is connected by a branch labelled $\tfrac12 k$, where $k$ is the
order of the group generated by reflections in $\mathfrak{p}_i$ and $\mathfrak{p}_j$, except that:

if $\{\mathfrak{p}_i, \mathfrak{p}_j\} = 0$, the branch is omitted,
if $k = 6$, the branch is left unlabelled.










These conventions are adopted so that 3.4 and 3.5 reduce to rules 3.1 and 3.2 
if all the primes are real. 
As an example, the group generated by reflections in the $n$ primes

3.6
$m$-fold: $x_1 = 0$,
2-fold: $x_i - x_{i-1} = 0$ ($i = 2, 3, \ldots, n$),

has the graph

3.7    [graph: a simple chain whose first node is labelled $m$] ($n$ nodes).


More generally, the graph of a set of reflecting primes that generate the
symmetry group of the regular polytope with extended Schläfli symbol

$$p_1(q_1)p_2(q_2) \ldots (q_{n-1})p_n$$

is a simple chain

[graph: a chain of $n$ nodes labelled $p_1, p_2, \ldots, p_n$, with consecutive branches labelled $\tfrac12 q_1, \ldots, \tfrac12 q_{n-1}$]

A unitary group is said to be false if all the matrices can be reduced to
orthogonal form by suitable change of coordinate system. Otherwise the group is a
true unitary group. Thus the group generated by two reflections in 2.4 is false,
and the proof of the result depended upon this fact.


Let $A_1, A_2, \ldots, A_m$ be $m$ nodes of a graph representing a set of primes. If the
pairs of nodes $A_1A_2, A_2A_3, \ldots, A_{m-1}A_m, A_mA_1$ are joined by branches, then the
nodes $A_1A_2 \ldots A_m$ are said to form a circuit [4, p. 328]. All finite orthogonal
groups (and therefore all false unitary groups) have graphs that do not contain
any circuits [5, p. 297]. A connected graph without any circuits is called a tree
[4, p. 328].


3.8 The graph of a set of primes that generate a true unitary group has either a 
numbered node (that is to say, there is a p-fold reflection with p > 2) or a 
circuit. 


A graph with a numbered node necessarily represents a true unitary group
since a matrix of type 2.2 with $p > 2$ cannot be real in any coordinate system.
Suppose therefore that the group is generated by 2-fold reflections in the primes
$\mathfrak{p}_1, \mathfrak{p}_2, \ldots, \mathfrak{p}_N$, where

3.9
$$\mathfrak{p}_i \equiv \sum_{j=1}^{n} a_{ij} x_j = 0 \quad (i = 1, 2, \ldots, n),$$
$$\mathfrak{p}_i \equiv \sum_{j=1}^{n} a_{ij} x_j = c_i \quad (i = n+1, \ldots, N).$$

For the purposes of the proof we take $c_i = 0$ (all $i$). This corresponds to a
“parallel displacement” of the prime and this does not affect the proof. Let $P_i$ be the
node of the graph corresponding to the prime $\mathfrak{p}_i$ and suppose that the graph is a
tree $T$. We prove, by induction on the number of nodes, that the corresponding
unitary group is false.

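If there is only one node the group is certainly false, so we assume that the
result has been established for the group generated by reflections in $r - 1$
primes $\mathfrak{p}_1, \mathfrak{p}_2, \ldots, \mathfrak{p}_{r-1}$ such that the corresponding nodes $P_1, P_2, \ldots, P_{r-1}$
and branches form a sub-tree $T_1$ of $T$. Choose a new coordinate system so that
$\mathfrak{p}_1$ is the prime $x_1 = 0$, the intersection $\mathfrak{p}_1 \cdot \mathfrak{p}_2$ is $x_1 = x_2 = 0$, and generally
the intersection $\mathfrak{p}_1 \cdot \mathfrak{p}_2 \cdot \ldots \cdot \mathfrak{p}_i$ is $x_1 = x_2 = \ldots = x_i = 0$. In this system

$$\mathfrak{p}_i \equiv \sum_{j=1}^{r-1} b_{ij} x_j = 0 \quad (i = 1, 2, \ldots, r-1),$$

where $b_{ij} = 0$ ($j > i$) and, by the induction hypothesis, all the other coefficients
are real. Regarding these primes as normalized,

$$\sum_{j=1}^{r-1} b_{ij}\bar{b}_{ij} = 1$$

and

$$\Bigl|\sum_{j=1}^{r-1} b_{ij}\bar{b}_{kj}\Bigr| = \Bigl|\sum_{j=1}^{n} a_{ij}\bar{a}_{kj}\Bigr| = \{\mathfrak{p}_i, \mathfrak{p}_k\}.$$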


Add to the tree $T_1$ another node $P_r$ of $T$ in such a manner that the nodes
$P_1, P_2, \ldots, P_r$ and the associated branches form another sub-tree of $T$. $P_r$ is
connected by one and only one branch to a node of $T_1$, for if there were more
than one branch the resultant graph would contain a circuit. Suppose that the
nodes are numbered so that $P_r$ is joined to $P_1$ only. Then if

$$\mathfrak{p}_r \equiv \sum_{j=1}^{r} b_{rj} x_j = 0$$

is the normalized equation of the prime, it follows that

3.10  $$\Bigl|\sum_j b_{rj}\bar{b}_{1j}\Bigr| = \Bigl|\sum_j a_{rj}\bar{a}_{1j}\Bigr|,$$

3.11  $$\sum_j b_{rj}\bar{b}_{ij} = 0 \quad (i = 2, 3, \ldots, r-1).$$

But 3.10 determines $b_{r1}$ as a real quantity since the only non-vanishing term on
the left is $b_{r1}\bar{b}_{11}$ and the right-hand side is real. Equations 3.11 determine
$b_{r2}, b_{r3}, \ldots, b_{r(r-1)}$ successively as real quantities, and $b_{rr}$ may be made real
by multiplying the $x_r$ coordinate by a suitable complex number of unit
modulus.

Hence the result is established for connected graphs. It follows for
disconnected graphs since the different parts of the graph are independent, representing
primes whose intersections are absolutely orthogonal subspaces of $U_n$. The
theorem is therefore true.

The converse of the result is false, since a graph with a circuit may represent
an infinite discrete reflection group in $E_n$. For example the graph

[graph: three nodes joined in a triangle, with no labels]

is $P_3$, a subgroup of index 2 in the symmetry group of the degenerate polyhedron
forming the plane honeycomb of hexagons. This example also shows that
a graph with a circuit may represent two or more different sets of reflecting
primes. (By the proof of 3.8, a graph with no circuit represents a unique set of
real primes, within a parallel displacement.) In order to make the correspondence
between the graphs and sets of primes unique it is necessary to label the circuits
of the graph according to the rule:


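3.12 If the nodes $P_1, P_2, \ldots, P_r$ of a graph form a circuit and $P_i$ corresponds to
the prime $\mathfrak{p}_i$, where

$$\mathfrak{p}_i \equiv \sum_j a_{ij} x_j = 0 \quad (i = 1, 2, \ldots, r),$$

then the circuit is labelled with the number $k$ where

3.13  $$e^{\pi i/k} = \prod_{i=1}^{r}\Bigl(\sum_j a_{ij}\,\bar{a}_{(i+1)j}\Bigr) \Big/ \prod_{i=1}^{r}\{\mathfrak{p}_i, \mathfrak{p}_{i+1}\}.$$

Here, for convenience of notation, $a_{(r+1)j} = a_{1j}$ (all $j$). In the cases we consider
in §4, $k$ will be an integer. If $k = 1$ the circuit will be left unlabelled.


The meaning of 3.13 is more easily understood if we arrange that $\sum_j a_{ij}\bar{a}_{(i+1)j}$
is real for all $i$ except possibly $i = r$, and then

3.14  $$\sum_j a_{rj}\,\bar{a}_{1j} = e^{\pi i/k}\{\mathfrak{p}_r, \mathfrak{p}_1\}.$$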


Determining the equations of the primes as in the proof of 3.8, it will be seen 
that a graph with nodes, branches, and circuits labelled according to rules 3.4, 
3.5, and 3.12 now represents a set of reflecting primes uniquely (within a parallel 
displacement) and so represents a unitary group completely. It is not, however, 
always possible to find a set of primes corresponding to any given graph. 

If the normalized equations of the primes $\mathfrak{p}_i$ are taken as in 3.9, we write $A$
for the matrix $(a_{ij})$ and then

3.15  $$D = \det(A\bar{A}') = |\det A|^2 \ge 0.$$

This is evidently a necessary condition for the set of primes to exist. The
determinant $D$ is called the Schläfli determinant [3, p. 137; 5, pp. 134-135] and its
importance lies in the fact that if all the reflections are 2-fold it can be written
down from the graph. If $d_{ij}$ is the $(i, j)$th term of $D$, then

3.16
$d_{ii} = 1$ (all $i$),
$d_{ij} = d_{ji} = 0$ if the corresponding nodes $P_i P_j$ of the graph are not connected by a branch,
$d_{ij} = d_{ji} = \cos(\pi/k)$ if the corresponding nodes $P_i P_j$ of the graph are connected by a branch labelled “$k$”.

The only other condition is that where the nodes $P_1, P_2, \ldots, P_r$ form a circuit
labelled $k$, the factor $e^{\pi i/k}$ must be put before one of the terms $d_{12}, d_{23}, \ldots, d_{r1}$
and the factor $e^{-\pi i/k}$ must be put before the corresponding term with suffixes
reversed (see equation 3.14).

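By way of an example, consider the graph

[graph: three nodes joined in a triangle, the circuit labelled 3]

The Schläfli determinant is then

$$\begin{vmatrix} 1 & \tfrac12 & -\tfrac12\omega \\ \tfrac12 & 1 & \tfrac12 \\ -\tfrac12\omega^2 & \tfrac12 & 1 \end{vmatrix}$$

and its value is $\tfrac38$, and so the graph satisfies the condition 3.15.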
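The determinant of this example can be evaluated numerically. The sketch below is an illustration only: it assumes the reading of rule 3.16 in which each unlabelled branch contributes $\cos(\pi/3) = \tfrac12$ and a circuit labelled $k$ contributes the factors $e^{\pi i/k}$ and $e^{-\pi i/k}$ to $d_{31}$ and $d_{13}$; the function names are hypothetical.

```python
import cmath

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def triangle_schlafli(k):
    """Schläfli matrix of a triangle of unlabelled branches whose circuit is labelled k."""
    half = 0.5  # cos(pi/3) for each unlabelled branch
    phase = cmath.exp(1j * cmath.pi / k)  # assumed circuit factor e^(pi i / k)
    return [[1, half, half * phase.conjugate()],
            [half, 1, half],
            [half * phase, half, 1]]
```

Under these assumptions `det3(triangle_schlafli(3))` is $\tfrac38$ up to rounding, while the unlabelled circuit ($k = 1$) gives zero, as expected for primes that bound a triangle in the plane and are therefore linearly dependent.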


4. The groups $[p\ q;\ r]^m$. Consider the group generated by $p + q + r$
2-fold reflections:

4.1    [graph: a triangle of nodes $P_1$, $Q_1$, $R_1$ in which the branch $P_1Q_1$ and the circuit are labelled $m$, continued by simple chains so that there are $p$ nodes $P_i$, $q$ nodes $Q_i$, and $r$ nodes $R_i$ in all]

It is denoted by the symbol $[p\ q;\ r]^m$, noting that when $m = 2$, the graph becomes

4.2    [graph: a tree with three chains meeting at one node]

so that $[p\ q;\ r]^2 = [3^{p,q,r-1}]$ in the notation of [5, p. 200]. This alternative
notation is useful since it exhibits the symmetry between the numbers $p$, $q$,
$r - 1$. That is to say, the group is the same whatever the order of the indices.
In general

$$[p\ q;\ r]^m = [q\ p;\ r]^m$$

and if $m = 3$, the $p$, $q$, $r$ may be permuted in any way.
A necessary condition for $[p\ q;\ r]^m$ to exist is given by the Schläfli determinant.








Writing $\theta = e^{\pi i/m}$,

$$D \ge 0,$$

where $D$ is the determinant of order $p + q + r$ formed from the graph 4.1 by rule 3.16: its diagonal terms are 1, the two terms belonging to the branch labelled $m$ are $\theta\cos(\pi/m)$ and $\bar\theta\cos(\pi/m)$, the terms belonging to every other branch are $\tfrac12$, and all remaining terms are zero.

Direct evaluation gives (within a positive factor)

4.3  $$1 - pq\{r + 4\cos^2(\pi/m) - 1\} + p + q + r \ge 0.$$

When $m = 2$ this reduces to

$$pq(r - 1) \le p + q + r + 1,$$

which is the condition for the group $[3^{p,q,r-1}]$ to exist in a Euclidean space [3,
p. 143].
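Condition 4.3 is easy to evaluate for particular values. The following sketch assumes the form $1 - pq\{r + 4\cos^2(\pi/m) - 1\} + p + q + r$ (the squared cosine is a reconstruction, chosen because it makes the expression vanish exactly for the groups that table 4.4 lists as infinite); the function names are hypothetical.

```python
import math

def schlafli_condition(p, q, r, m):
    """Left-hand side of 4.3 for the symbol [p q; r]^m (assumed reconstruction)."""
    c = 4 * math.cos(math.pi / m) ** 2
    return 1 - p * q * (r + c - 1) + p + q + r

def sign_of_condition(p, q, r, m, tol=1e-9):
    """Classify the value of 4.3 as 'positive', 'zero', or 'negative'."""
    d = schlafli_condition(p, q, r, m)
    if d > tol:
        return "positive"
    if d < -tol:
        return "negative"
    return "zero"
```

For example `sign_of_condition(2, 1, 3, 3)` is `"positive"`, `sign_of_condition(2, 1, 4, 3)` is `"zero"`, and `sign_of_condition(2, 1, 5, 3)` is `"negative"`, so no group $[2\ 1;\ 5]^3$ can exist.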

When $D = 0$, the primes are linearly dependent, for this implies (3.15) that
$\det A = 0$. Two cases arise according to whether we take the primes as
concurrent or not. Apart from an anomalous case mentioned in a footnote to table
4.4, there are no new groups defined by concurrent primes, and so we take
their equations as in 3.9 with all the $c_i$ zero except $c_{n+1} = 1$ (or any other
non-zero constant).

Table 4.4 lists all possible values of $p$, $q$, $r$ and $m$ satisfying 4.3. The table also
gives suitable sets of reflecting primes (not necessarily in the smallest possible
number of dimensions) and the order of each of the groups (computed from the
abstract definitions in table 4.12).

By applying the rules 4.5-4.9, the abstract definition of each of these
groups may be written down from its graph. The definitions are given in full
in table 4.12.








4.4 Table of groups $[p\ q;\ r]^m$

m | group | reflecting primes | order
2 | $[3^{1,1,n-3}] = B_n$ | $(a)_n$, (b) | $2^{n-1}\,n!$
2 | $[3^{2,2,1}] = E_6$ | $(a)_6$, (c) | $72 \cdot 6!$
2 | $[3^{3,2,1}] = E_7$ | $(a)_7$, (c) | $8 \cdot 9!$
2 | $[3^{4,2,1}] = E_8$ | $(a)_8$, (c) | $192 \cdot 10!$
2 | $[3^{2,2,2}] = T_7$ | $(a)_6$, (c)', (d) | $\infty$
2 | $[3^{3,3,1}] = T_8$ | $(a)_8$, (e) | $\infty$
2 | $[3^{5,2,1}] = T_9$ | $(a)_9$, (c)' | $\infty$
3 | $[1\ 1;\ n-2]^3$ | $(a)_n$, (f) | $3^{n-1}\,n!$
3 | $[2\ 1;\ 2]^3$ | $(a)_4$, (f), (g) | $72 \cdot 6!$
3 | $[2\ 1;\ 3]^3$ | $(a)_5$, (f), (g) | $108 \cdot 9!$
3 | $[2\ 1;\ 4]^3$ | $(a)_6$, (f), (g)' | $\infty$
4 | $[1\ 1;\ n-2]^4$ | $(a)_n$, (h) | $4^{n-1}\,n!$
4 | $[2\ 1;\ 1]^4$ | $(a)_3$, (h), (k) | $64 \cdot 5!$
4 | $[2\ 1;\ 2]^4$ | $(a)_4$, (h), (k)' | $\infty$
4 | $[3\ 1;\ 1]^4$ | $(a)_3$, (h), (k), (l)¹ | $\infty$
m | $[1\ 1;\ n-2]^m$ | $(a)_n$, (m) | $m^{n-1}\,n!$

$(a)_n$  $x_i - x_{i-1} = 0$ ($i = 2, 3, \ldots, n$),
(b)  $x_1 + x_2 = 0$,
(c)  $2(x_1 + x_2 + x_3) - (x_4 + x_5 + x_6 + x_7 + x_8 + x_9) = 0$,
(c)'  $2(x_1 + x_2 + x_3) - (x_4 + x_5 + x_6 + x_7 + x_8 + x_9) = 1$,
(d)  $(x_1 + x_2 + x_3 + x_4 + x_5 + x_6) - 2(x_7 + x_8 + x_9) = 0$,
(e)  $(x_1 + x_2 + x_3 + x_4) - (x_5 + x_6 + x_7 + x_8) = 1$,
(f)  $x_1 - \omega x_2 = 0$ ($\omega$ a primitive cube root of unity),
(g)  $x_1 + x_2 + x_3 + x_4 + x_5 + x_6 = 0$,
(g)'  $x_1 + x_2 + x_3 + x_4 + x_5 + x_6 = 1$,
(h)  $x_1 - ix_2 = 0$,
(k)  $x_1 + x_2 + x_3 + x_4 = 0$,
(k)'  $x_1 + x_2 + x_3 + x_4 = 1$,
(l)  $x_4 = 1$,
(m)  $x_1 - \theta x_2 = 0$ ($\theta$ a primitive $m$th root of unity).

¹ If the primes are concurrent, that is, we take $x_4 = 0$ instead of (l), then this is a finite group
of order $64 \cdot 6!$. It is the symmetry group of the polytope $(\tfrac24\gamma^4_3)^{+1}$ described later (6.13).

Denote the nodes of the graph by $P_1, P_2, \ldots, P_n$, the node $P_i$ being labelled
$p_i$, that is, corresponding to a $p_i$-fold reflection. Let $P_i$ denote the operation in








$\mathfrak{G}$ corresponding to reflection in the prime $\mathfrak{p}_i$, and let $E$ be the identity of $\mathfrak{G}$.
Then $\mathfrak{G}$ is generated by $P_1, P_2, \ldots, P_n$ subject to the relations:


4.5 $(P_i)^{p_i} = 1$ ($i = 1, 2, \ldots, n$).

4.6 $P_iP_j = P_jP_i$ for every pair of nodes $P_i$, $P_j$ not connected by a branch.

4.7 $(P_iP_j)^k = 1$ for every pair of nodes $P_i$, $P_j$ connected by a branch labelled $k$,
and with $p_i = p_j = 2$.


4.8 A relation is required connecting $P_iP_j$ when $p_i$ or $p_j$ is not 2. This cannot
be read off from the graph immediately, but the required relation for

[graph: two joined nodes $P_i$, $P_j$, one of them numbered]

is $(P_iP_j)^2 = (P_jP_i)^2$.


4.9 If $P_1, P_2, \ldots, P_r$ form a circuit, one further relation is required connecting
the operations $P_1, P_2, \ldots, P_r$. In the case of a circuit of the form

[graph: three nodes joined in a triangle]

a suitable relation is (P2 P; P;)? = (P; P; P:»)?. 


Table 4.12 gives the abstract definitions of the groups $\mathfrak{G}(\beta^m_n)$ (see §6) and the
finite groups $[p\ q;\ r]^m$; 4.10 and 4.11 indicate the method of labelling the nodes.
Abstract definitions for the groups $[3^{p,q,r}]$ are given in [3, pp. 144-151].























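4.10    [graph: the chain 3.7 with its nodes named $P, Q_1, Q_2, \ldots, Q_{n-2}, Q_{n-1}$ from left to right, the node $P$ being labelled $m$]

4.11    [graph: the graph 4.1 with its nodes named $P_p, \ldots, P_2, P_1$; $Q_q, Q_{q-1}, \ldots, Q_2, Q_1$; and $R_1, R_2, \ldots, R_{r-1}, R_r$]

4.12 Table of abstract definitions.

group | generators | relations
$\mathfrak{G}(\gamma^m_2)$ | $P, Q_1$ | (a)
$\mathfrak{G}(\gamma^m_n)$ | $P, Q_1, Q_2, \ldots, Q_{n-1}$ | (a), (b)
$\mathfrak{G}(\tfrac1m\gamma^m_3) = [1\ 1;\ 1]^m$ | $P_1, Q_1, R_1$ | $(c)_m$
$\mathfrak{G}(\tfrac1m\gamma^m_n) = [1\ 1;\ n-2]^m$ | $P_1, Q_1, R_1, \ldots, R_{n-2}$ | $(c)_m$, (d)
$[2\ 1;\ 2]^3$ | $P_2, P_1, Q_1, R_1, R_2$ | $(c)_3$, (e)
$[3\ 1;\ 2]^3$ | $P_3, P_2, P_1, Q_1, R_1, R_2$ | $(c)_3$, (e), (f)
$[2\ 1;\ 1]^4$ | $P_2, P_1, Q_1, R_1$ | $(c)_4$, (g)

(a)  $P^m = Q_1^2 = 1$; $(PQ_1)^2 = (Q_1P)^2$,

(b)  $Q_i^2 = (Q_iQ_{i-1})^3 = 1$ ($i = 2, 3, \ldots, n-1$);
     $PQ_i = Q_iP$ ($i = 2, 3, \ldots, n-1$);
     $(Q_iQ_j)^2 = 1$ ($i, j = 1, 2, \ldots, n-1$; $|i - j| \ge 2$),

$(c)_m$  $P_1^2 = Q_1^2 = R_1^2 = (Q_1R_1)^3 = (R_1P_1)^3 = (P_1Q_1)^m = 1$;
     $(P_1Q_1R_1)^2 = (R_1P_1Q_1)^2$,

(d)  $R_i^2 = (P_1R_i)^2 = (Q_1R_i)^2 = (R_iR_{i-1})^3 = 1$ ($i = 2, 3, \ldots, n-2$);
     $R_iR_j = R_jR_i$ ($i, j = 1, 2, \ldots, n-2$; $|i - j| \ge 2$),

(e)  $P_2^2 = R_2^2 = (P_2P_1)^3 = (P_2Q_1)^2 = (P_2R_1)^2 = (P_2R_2)^2 = (P_1R_2)^2 = (Q_1R_2)^2 = (R_1R_2)^3 = 1$,

(f)  $P_3^2 = (P_3P_2)^3 = (P_3P_1)^2 = (P_3Q_1)^2 = (P_3R_1)^2 = (P_3R_2)^2 = 1$,

(g)  $P_2^2 = (P_2P_1)^3 = (P_2Q_1)^2 = (P_2R_1)^2 = 1$.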


The above definitions have been checked by the Todd-Coxeter method [15].
In the case of the larger groups the work can be simplified by considering a
polytope $\Pi_n$ whose symmetry group is being examined, and taking as the
generating subgroup the symmetry group of one of the vertex figures of $\Pi_n$.
The vertices of $\Pi_n$ are then in 1-1 correspondence with the cosets of this
subgroup, and the work in the coset tables can be continually checked.

I am indebted to J. A. Todd for the following remark about the group
$[3\ 1;\ 2]^3$. The given relations $(c)_3$, (e), (f) imply

$$(P_3P_2P_1Q_1R_1R_2)^{42} = 1.$$

If, however, we postulate $(P_3P_2P_1Q_1R_1R_2)^7 = 1$, the resulting factor group is of
order $18 \cdot 9!$, being the collineation group [11, p. 84] corresponding to $[3\ 1;\ 2]^3$,
viz. the group of the Mitchell-Hamill configuration in five dimensions [8].


5. Graphs for polytopes. In order to represent a polytope graphically we
add to the graph of its symmetry group one or more rings round the nodes
[4, p. 329]. Of particular interest are the polytopes denoted by a graph with
only one ring, and we define these in 5.1.

First, we suppose that the reflecting primes lie in space of $n$ dimensions, are
$n$ in number and are linearly independent. Let $O$ be the point of concurrency
of the primes. Define $G_i$ ($i = 1, 2, \ldots, n$) as being any point at unit distance
from $O$ and lying on the line of intersection of $\mathfrak{p}_1, \mathfrak{p}_2, \ldots, \mathfrak{p}_{i-1}, \mathfrak{p}_{i+1}, \ldots, \mathfrak{p}_n$,
and let $P_i$ be the node corresponding to $\mathfrak{p}_i$.


5.1 If the node $P_i$ of the graph is ringed, then the new graph represents the
polytope of which one vertex is $G_i$ and the other vertices are the images of $G_i$ under
the operations of the group.


For example, 3.3 represents a cube if the left node is ringed, and an octahedron
if the right node is ringed.

If, on the other hand, the group is generated by reflections in $n + 1$
non-concurrent primes, then $G_i$ is defined as the point of intersection of all the primes
except the $i$th, and rule 5.1 still applies. Only in these two cases will the graphical
notation for a polytope be employed.

So far, the choice of reflecting primes for a given unitary group has not been 
restricted in any way, so that a number of different graphs may correspond to 
the same group. It is now convenient to discuss some of the restrictions that 
may be imposed, so that the graph of the polytope has a number of additional 
properties. Select the primes so that: 


5.2 Reflections in $\mathfrak{p}_1, \mathfrak{p}_2, \ldots, \mathfrak{p}_n$ (or $\mathfrak{p}_1, \mathfrak{p}_2, \ldots, \mathfrak{p}_{n+1}$) generate the group.


5.3 The points $G_1, G_2, \ldots, G_n$ (or $G_1, G_2, \ldots, G_{n+1}$) are not equivalent, that
is to say, one cannot be transformed into another by an operation of the group.


5.4 If there is more than one set of primes satisfying 5.3, choose that set which
makes as many as possible of the polytopes given by ringing one node different.


5.5 The image of $G_i$ under reflection in $\mathfrak{p}_i$ is at least as near $G_i$ as any other
point equivalent to $G_i$. Thus the vertex $G_i$ of the polytope with the $i$th node
ringed is transformed by reflection in $\mathfrak{p}_i$ into a point of its vertex figure.


The primes denoted by the graphs 4.1 have all these properties, and
conversely for a group of this type, the set of primes satisfying 5.2, 5.3, 5.4, and 5.5
is unique, within an operation of the group. There is reason to suppose that
selection according to the above rules is possible for any finite $n$-dimensional
unitary group generated by reflections, or any discrete infinite group generated
by $n + 1$ reflections in non-concurrent primes, but this has not been proved.

With this choice of primes the rule given by Coxeter [5, p. 198] for obtaining 
the graph of the vertex figure of a polytope still holds: 


5.6 If the ringed node belongs to only one branch, we obtain the vertex figure by 
removing that node (along with its branch) and transferring the ring to the node to 
which that branch was connected. 


For example the vertex figure of 6.7 is 6.9 and the vertex figure of 6.9 is

[graph not recoverable]

If the polytope is a polygon, application of this rule leaves us with a single
ringed node, labelled “$p$” say. This is to be interpreted as a $p$-line [11, p. 85].

5.8 In order to determine the number of vertices of any given polytope $\Pi_n$, consider
the group $\mathfrak{G}^*$ which corresponds to the graph formed by removing the ringed node
(and any branches connected to it) from the graph of $\Pi_n$. The number of vertices
is then the quotient of the order of $\mathfrak{G}(\Pi_n)$ by the order of $\mathfrak{G}^*$ [cf. 4, p. 329].

For example, in the case of the polytope 6.10, the group $\mathfrak{G}^*$ has the graph

o-----o-----o-----o

This is the symmetry group of the regular simplex $\alpha_4$, and so is of order $5!$.
The symmetry group of 6.10 is $[2\ 1;\ 2]^3$ of order $72 \cdot 6!$, and so the number of
vertices is $72 \cdot 6!/5! = 432$.
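Rule 5.8 is plain arithmetic on group orders, so the vertex counts quoted in this paper can be checked mechanically. In the sketch below the orders are those of table 4.4; the pairing of each polytope with its subgroup $\mathfrak{G}^*$ in the usage note is an assumption read off from the vertex figures named in §6, and the names are hypothetical.

```python
from math import factorial

def order_11_group(m, n):
    """Order m^(n-1) * n! of [1 1; n-2]^m, as listed in table 4.4."""
    return m ** (n - 1) * factorial(n)

ORDER_212_3 = 72 * factorial(6)   # [2 1; 2]^3
ORDER_213_3 = 108 * factorial(9)  # [2 1; 3]^3

def vertex_count(group_order, subgroup_order):
    """Rule 5.8: quotient of the order of the symmetry group by the order of G*."""
    quotient, remainder = divmod(group_order, subgroup_order)
    if remainder:
        raise ValueError("subgroup order does not divide group order")
    return quotient
```

`vertex_count(ORDER_212_3, factorial(5))` reproduces the 432 vertices above. Taking $\mathfrak{G}^*$ to be $[1\ 1;\ 3]^3$, $[2\ 1;\ 2]^3$, and $[1\ 1;\ 2]^3$ in turn gives 4032, 756, and 80, the counts stated in §6 for the polytopes 6.6, 6.8, and 6.9.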

Some of the bounding figures of a complex polytope may be determined
from its graph in the same manner as for a real polytope [4, p. 334]. There are,
however, other bounding figures not given by this procedure. A simple example
is the icosahedron 2(6)2(10)2, which, in addition to twenty triangles contains
twelve pentagons of the same edge length. These are not counted as bounding
figures of the real polyhedron since they lie inside the figure. There is no
distinction between interior and exterior of a complex polyhedron [11, p. 83] and
so in this case these pentagons (which are not given by the graph) must be
included.

In order to facilitate reference, it is convenient to define a symbol for the
polytope whose graph is given by ringing one of the nodes of the graph of
$[p\ q;\ r]^m$. Referring to 4.11, if $P_i$ is ringed, the polytope is denoted by $(p_i\ q;\ r)^m$
and similarly, suffixes are added to the $q$ or $r$ if nodes $Q_i$ or $R_i$ are ringed
[cf. 4, p. 331]. For example, $(2\ 1;\ 3_3)^3$, which is the same as $(3_3\ 2;\ 1)^3$, is the
polytope represented by graph 6.7.


6. Fractional $\gamma$ polytopes. In $n$ dimensions ($n > 4$) there are three real
regular non-degenerate polytopes [2, p. 344]. These are the regular simplex,
the cross polytope, and the measure polytope or orthotope. They are denoted
by $\alpha_n$, $\beta_n$, and $\gamma_n$ respectively. In $U_n$ ($n > 4$), in addition to the simplex there
are two series of regular polytopes: the generalized cross polytopes and the
generalized orthotopes [11, p. 96]. The first of these, which is denoted by $\beta^m_n$,
has $mn$ vertices:

$$({}_m1, 0, 0, \ldots, 0)'$$

in the abbreviated notation. (The pre-suffix $m$ implies that the 1 may be
multiplied by any $m$th root of unity, and the prime means that the coordinates are
to be permuted in every way. For a fuller explanation see [11, p. 96].) The
extended Schläfli symbol for this polytope is $2(6)2(6) \ldots (6)2(2m^2)m$.

The reciprocal polytope is the generalized orthotope denoted by $\gamma^m_n$ and
first described by Coxeter [4a, p. 287]. It has $m^n$ vertices:

6.1  $$(\theta^{k_1}, \theta^{k_2}, \ldots, \theta^{k_n})$$

where $k_1, k_2, \ldots, k_n$ take any integral values and $\theta$ is a primitive $m$th root of
unity. The symmetry groups of $\beta^m_n$ and $\gamma^m_n$ are identical, of order $m^n \cdot n!$, and
their abstract definitions are given as $\mathfrak{G}(\beta^m_n)$ in table 4.12. The group is
generated by reflections in the primes 3.6 and so may be represented by the graph
3.7. The polytope $\beta^m_n$ is denoted by ringing the node furthest to the right, and
$\gamma^m_n$ is denoted by ringing the node on the left.

Polytopes corresponding to the same graph but with any other node ringed
are what may be called truncations of $\beta^m_n$ or $\gamma^m_n$. If the node $Q_p$ (see 4.10) is
ringed we may denote the resulting polytope by $t_p\gamma^m_n$ or $t_{n-p-1}\beta^m_n$, by analogy
with the corresponding notation in the theory of real polytopes [2, p. 354; 5,
pp. 145-148]. The vertices of this polytope are

6.2  $$({}_m1, {}_m1, \ldots, {}_m1, 0, 0, \ldots, 0)'$$

with $p$ terms zero and $n$ coordinates in all. They are the centres of the $\alpha_{n-p-1}$
that bound $\beta^m_n$ or of the $\gamma^m_p$ that bound $\gamma^m_n$.

Evidently $\beta^2_n$ and $\gamma^2_n$ are the real polytopes $\beta_n$ and $\gamma_n$ respectively, so that we
conventionally omit the superscript if its value is 2.
The eight vertices of $\gamma_3$ may be divided into two sets of four such that each
set consists of the vertices of a regular tetrahedron $\alpha_3$. The same process may
be applied to real $\gamma$ polytopes of higher dimension, and we write $\tfrac12\gamma_n$ (which is the
$h\gamma_n$ of Coxeter [2, p. 362]) for the polytope whose vertices are the “alternate”
vertices of $\gamma_n$; that is, we select half the vertices of $\gamma_n$ in such a manner that no
two are joined by an edge of $\gamma_n$. For example $\tfrac12\gamma_4 = \beta_4$ and in general in the
notation of Coxeter [4, p. 331; 2, p. 372],

$$\tfrac12\gamma_n = 1_{(n-3)1} = (1_1\ 1\ (n - 3)).$$

In a similar manner we can select a subset of the vertices of $\gamma^m_n$ so that the
points of this subset are equivalent. Writing the coordinates of the vertices as
in 6.1, instead of allowing $k_1, k_2, \ldots, k_n$ to take any integral values we consider
the points for which

$$\sum k_i \equiv 0 \pmod m.$$

There are precisely $m^{n-1}$ such vertices and there are $m$ similar subsets in all,
given by the $m$ congruence classes of $\sum k_i$ modulo $m$. Taking these $m^{n-1}$ points
as vertices, and lines joining pairs of vertices whose distance apart is $\sqrt2$ as
edges we obtain a polytope which will be denoted by $\tfrac1m\gamma^m_n$.
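The selection of vertices can be made concrete by listing exponent vectors. This is a minimal sketch with hypothetical names; it records only the exponents $k_i$ of 6.1, not the complex coordinates themselves.

```python
from itertools import product

def fractional_gamma_exponents(m, n, residue=0):
    """Exponent vectors (k_1, ..., k_n), 0 <= k_i < m, with sum congruent to residue mod m."""
    return [k for k in product(range(m), repeat=n) if sum(k) % m == residue]
```

For $m = n = 3$ the list has nine entries, and over all residues the $m$ subsets together account for all $m^n$ vertices of $\gamma^m_n$.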

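For example, $\tfrac13\gamma^3_3$ has nine vertices:

$$(\omega^{k_1}, \omega^{k_2}, \omega^{k_3})$$

where $\omega^3 = 1$, $k_1 + k_2 + k_3 \equiv 0 \pmod 3$, or

$$\omega^k(1, 1, 1) \ (k = 0, 1, 2), \qquad (1, \omega, \omega^2)'.$$

These nine points are the vertices of $\beta^3_3$, and may be reduced to the more
familiar form $({}_31, 0, 0)'$ [11, p. 96] by the transformation

$$x^* = \tfrac13\begin{pmatrix} 1 & 1 & 1 \\ 1 & \omega & \omega^2 \\ 1 & \omega^2 & \omega \end{pmatrix} x.$$

In general $\tfrac1m\gamma^m_n$ will not be regular, but there are four exceptional cases:

6.3  $$\tfrac12\gamma_3 = \alpha_3, \qquad \tfrac13\gamma^3_3 = \beta^3_3, \qquad \tfrac12\gamma_4 = \beta_4, \qquad \tfrac1m\gamma^m_2 = \{m\},$$

where $\{m\}$ is the real regular $m$-gon.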





The symmetry group of $\tfrac1m\gamma^m_n$ is of order $m^{n-1} \cdot n!$ and is generated by reflections
in $n$ primes,

6.4
2-fold: $x_1 - \theta x_2 = 0$ ($\theta^m = 1$),
2-fold: $x_i - x_{i-1} = 0$ ($i = 2, 3, \ldots, n$),

from which we deduce that it is the group $[1\ 1;\ n-2]^m$. Hence $\tfrac1m\gamma^m_n$ is the
polytope $(1_1\ 1;\ n-2)^m$ and its graph is

[graph: the graph 4.1 with $p = q = 1$ and $r = n - 2$, the node $P_1$ being ringed]

with $n$ nodes. The polytopes corresponding to the same graph but with other
nodes ringed are identical with the truncations of $\gamma^m_n$ or $\beta^m_n$, with vertices 6.2.
This may be stated in the form of a rule:


6.5 A polytope is unaltered if its graph is changed by replacing

[two ringed graphs of the type 4.1 with $p = q = 1$]

by

[two ringed graphs of the type 3.7]

respectively.


When $m = 2$ the rule is already known [4, p. 333]. It is only the polytope
which is unchanged by the above replacements; the order of the group is
increased by a factor $m$.
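The factor $m$ between the two group orders can be confirmed for small cases by generating both groups. In this sketch every element is stored as a monomial map $x_i' = \theta^{e_i}x_{\pi(i)}$, written as a pair (permutation, exponents); this representation and all function names are assumptions introduced for the illustration.

```python
def compose(h, g, m):
    """Monomial map 'g then h'; each map is (perm, exps) with x'_i = theta^exps[i] * x_perm[i]."""
    hp, he = h
    gp, ge = g
    n = len(hp)
    perm = tuple(gp[hp[i]] for i in range(n))
    exps = tuple((he[i] + ge[hp[i]]) % m for i in range(n))
    return perm, exps

def swap(n, i, j, ei=0, ej=0):
    """Interchange coordinates i and j, multiplying them by theta^ei and theta^ej."""
    perm = list(range(n))
    perm[i], perm[j] = j, i
    exps = [0] * n
    exps[i], exps[j] = ei, ej
    return tuple(perm), tuple(exps)

def generated_order(gens, n, m):
    """Order of the group generated by the given monomial maps (breadth-first closure)."""
    identity = (tuple(range(n)), (0,) * n)
    seen = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for g in frontier:
            for s in gens:
                h = compose(s, g, m)
                if h not in seen:
                    seen.add(h)
                    new.append(h)
        frontier = new
    return len(seen)

def order_primes_36(m, n):
    """Group generated by reflections in the primes 3.6."""
    m_fold = (tuple(range(n)), (1,) + (0,) * (n - 1))  # x_1 -> theta * x_1
    gens = [m_fold] + [swap(n, i - 1, i) for i in range(1, n)]
    return generated_order(gens, n, m)

def order_primes_64(m, n):
    """Group generated by reflections in the primes 6.4."""
    twisted = swap(n, 0, 1, 1 % m, (m - 1) % m)  # reflection in x_1 - theta * x_2 = 0
    gens = [twisted] + [swap(n, i - 1, i) for i in range(1, n)]
    return generated_order(gens, n, m)
```

For $m = n = 3$ the two orders are 162 and 54, that is $m^n \cdot n!$ and $m^{n-1} \cdot n!$.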

Another polytope whose vertices are a selection of the vertices of a $\gamma$ polytope
is that which we shall denote by $\tfrac24\gamma^4_n$. It has $2 \cdot 4^{n-1}$ vertices:

$$(i^{k_1}, i^{k_2}, \ldots, i^{k_n}), \qquad \sum k_j \equiv 0 \pmod 2,$$

where $i^2 = -1$. The symmetry group of this polytope is not generated by
reflections (see the note to table 4.4), and so cannot be symbolized graphically.
We shall refer to this polytope later as the vertex figure of a lattice.


Polytopes associated with the groups $[p\ q;\ r]^m$ may be derived from the
polytopes $\tfrac1m\gamma^m_n$ by using the latter as their vertex figures. In fact,

$$(p_p\ 1;\ n-2)^m = (\tfrac1m\gamma^m_n)^{+(p-1)}.$$

For example, $(2_2\ 1;\ 3)^3$, with graph

[graph: the graph 4.1 for $[2\ 1;\ 3]^3$ with the node $P_2$ ringed]

has, as its vertex figure (see 5.6), the polytope $(1_1\ 1;\ 3)^3 = \tfrac13\gamma^3_5$, and so we may
write

$$(2_2\ 1;\ 3)^3 = (\tfrac13\gamma^3_5)^{+1}.$$


It has 4032 vertices (by 5.8), whose coordinates, in the abbreviated notation, 
are 
(3, 0, 0, 0, 0, 0)’; + A(sl, 31, 31, 0,0, 0)’; 

6 ks ‘ 
6 s(2, —~". —~", —w", —w", —w**) > k, =0 (mod 3), 
where $\lambda = 1 - \omega$. The edge length is $\sqrt6$ and the vertex distance is 3. The related
projective configuration (in which the coordinates of the vertices are interpreted
as homogeneous coordinates in $P_5$) consists of the 672 points known as the
H-points (the vertices of 112 a-hexahedra) of the Mitchell-Hamill configuration
[8; 9; 13; 14].

Another polytope having the same symmetry group is $(3_3\ 1;\ 2)^3$ or $(2\ 1;\ 3_3)^3$
or $(\tfrac13\gamma^3_4)^{+2}$, with the graph


6.7 \3/ 


It has 756 vertices, 


+ A(sl, — 31, 0, 0,0, 0)’: 
6.8 


+ (ow, w, w'*, w**, w"*, w"*), ; Be k, = 0 (mod 3), 
and its edge length and vertex distance are both $\sqrt6$. The related projective
configuration consists of the 126 centres of homologies of the Mitchell-Hamill
configuration.

Polytopes symbolized by ringing other single nodes of the graph have the
same symmetry group, $[3\ 1;\ 2]^3$ or $[2\ 1;\ 3]^3$, so that the related projective
configuration will again be connected with the Mitchell-Hamill configuration.


For example, (2 1; 3)? or 
3 ; 


+ (33, 33,0,0,0,0)’; + A(s2, — sl, — 21, 0,0, 0)’; 


has 30,240 vertices: 


zk k Ks k ks Ke 
+ A(w',w’,w’, —w*, —w’, —w'*)’, 


+ (2w**, 2eo**, 2u**, Qu**, —w"*, —w"*)’, > k, = 0 (mod 3). 


+ (2 — ww, (2 — wa, w", w"*, wo", w*)’, | 


Its edge length and circumradius are $3\sqrt2$. The related projective configuration
consists of 5040 points lying by threes on the 1680 $\kappa$-lines [8, p. 403] of the
Mitchell-Hamill configuration. Each point is the harmonic conjugate of one of
the 126 points with respect to the other two that lie on the $\kappa$-line.


The symmetry group of (2 1; 2,)* is [2 1; 2]* of order 51,840. Its graph is 


6.9 \y/ 


and so it may be written (4y*,)*'. It has 80 vertices, the points 6.8 that lie on 
x, = 3. Evidently the points 6.6 or 6.8 lying on primes parallel to this will 
have the same symmetry group. In particular (2 1,; 2)* or 


6.10 \y 


has 432 vertices, the points 6.6 on Sox; = 3. 

The collineation group corresponding to $[2\ 1;\ 2]^3$ is the simple group of order
25,920 which is familiar as the collineation group of the Baker configuration
[1; 12]. Consequently the related projective configurations are associated with
the Baker figure. For example the 40 $\kappa$-lines through any
point of the Mitchell-Hamill configuration meet the prime polar to that point
[8, p. 402] in 40 points forming the projective figure related to $(2\ 1;\ 2_1)^3$.


The only other group with m = 3 to be discussed is the degenerate group 
generated by reflections in m + 1 primes, [2 1; 4]*. The degenerate polytope 


(2 1; 44)* or (4y*4)*? or 
\ 3/ 


is of exceptional interest since its vertices form the lattice associated with the
extreme duodenary form $K_{12}$ of [7]. The simplest way of exhibiting the vertices
(discovered by Todd and Coxeter) is


6.11   $(x_1, x_2, x_3, x_4, x_5, x_6)$


where the $x_i$ are integers of the field $R(\omega)$ mutually congruent modulo $\lambda$, and
whose sum is congruent to zero modulo 3.
The polytope (22 1; 4)* or ($y*s)*! with graph 





‘e—_e . ° o- 
yw 
has vertices which do not form a lattice but are a subset of the points 6.11. 
The coordinate vectors of the vertices are the aggregate of vectors

$a + \lambda b$

where $b$ is any vector of the set 6.11 and $a$ may be any one of











382 G. C. SHEPHARD 


(0, 0, 0, 0, 0, 0); A(al, al, 31, 0, 0, 0)’, 


) 


ks ks ks Re ks Re 
(w',w',w",w',w',w'), 


La &k k k Re Ke 
(2w"*, 2w"*, —w', —w*, —w', —w'’)’, | 


Now consider polytopes associated with the groups [p q; r]*. The only finite 
group is [2 1; 1]*, and the polytope (2, 1; 1)‘ has the graph 


* 4[> 


It is (4y43)* and has 80 vertices: 


> &: = 0 (mod 3). 


(2, 0, 0, 0)’; 
(at, a, a, a), i k, = 0 (mod 4). 


The edge length and vertex distance are both 2. The related projective
configuration consists of 20 points in three dimensions which form the vertices of
five tetrahedra selected out of the 15 tetrahedra that form the Klein configuration
[10, p. 48], the selection being made in such a manner that no three of the five
belong to a desmic system.

The 60 vertices of the complete Klein configuration are related to the 240 
vertices of the polytope (47‘;3)*!: 


6.12 


«(2, 0, 0, 0)’; 
6.13 (4(1 + 2), (1 + 2), 0, 0)’; 
ge a > k, = 0 (mod 2). 


The symmetry group of this figure is of order 46,080. The corresponding real
8-dimensional polytope is $(PA)_8$ or $4_{21}$ [2, p. 385; 5, pp. 201, 204].
The polytope (47‘;)*? is degenerate with vertices 


6.14   $(x_1, x_2, x_3, x_4)$


where the $x_i$ are integers of the field $R(i)$ mutually congruent modulo $(1 - i)$
and whose sum is congruent to zero modulo 2. The vertices form a lattice.
(Both this and the lattice (4*,)*+* bear a remarkable resemblance to the lattices 
$3_{31}$, $5_{21}$, and $2_{22}$ of [6, pp. 420-421].)

The degenerate polytope (2. 1; 2)*, (}y*s)*’, or 


* 4 >— 


has vertices with coordinate vectors 


6.15 a+ (1 — i)b 



where b is any coordinate vector of type 6.14 and a is any vector 


$(0, 0, 0, 0)$,   $(i^{k_1}, i^{k_2}, i^{k_3}, i^{k_4})$,   $\sum k_j \equiv 0 \pmod 4$.


The degenerate polytope (3; 1; 1)‘, (4y‘s)**, or 


a 


has coordinates of type 6.15 with b of type 6.14 and a any vector of the set 
6.12. 


REFERENCES


1. H. F. Baker, A locus with 25,920 linear self-transformations (Cambridge, 1936).

2. H. S. M. Coxeter, The polytopes with regular-prismatic vertex figures, Phil. Trans. Royal
Soc., Ser. A, 229 (1930), 329-425.

3. H. S. M. Coxeter, The polytopes with regular-prismatic vertex figures II, Proc. London
Math. Soc. (2), 34 (1932), 126-189.

4. H. S. M. Coxeter, Wythoff's construction for uniform polytopes, Proc. London Math. Soc. (2),
38 (1935), 327-339.

4a. H. S. M. Coxeter, The abstract groups $R^m = S^m = (R^jS^j)^{p_j} = 1$, $S^m = T^2 = (S^jT)^{2p_j} = 1$, and
$S^m = T^2 = (S^{-j}TS^jT)^{p_j} = 1$, Proc. London Math. Soc. (2), 41 (1936), 278-301.

5. H. S. M. Coxeter, Regular polytopes (London, 1948; New York, 1949).

6. H. S. M. Coxeter, Extreme forms, Can. J. Math., 3 (1951), 391-441.

7. H. S. M. Coxeter and J. A. Todd, An extreme duodenary form, Can. J. Math., 5 (1953),
384-392.

8. C. M. Hamill, On a finite group of order 6,531,840, Proc. London Math. Soc. (2), 52
(1951), 401-454.

9. E. M. Hartley, A sextic primal in five dimensions, Proc. Cambridge Phil. Soc., 46 (1950),
91-105.

10. R. W. H. T. Hudson, Kummer's Quartic Surface (Cambridge, 1905).

11. G. C. Shephard, Regular Complex Polytopes, Proc. London Math. Soc. (3), 2 (1952),
82-97.

12. J. A. Todd, On the simple group of order 25,920, Proc. Royal Soc. London, Ser. A, 189
(1947), 326-358.

13. J. A. Todd, The characters of a collineation group in five dimensions, Proc. Royal Soc.
London, Ser. A, 200 (1949), 320-336.

14. J. A. Todd, The invariants of a finite collineation group in five dimensions, Proc. Cambridge
Phil. Soc., 46 (1950), 73-90.

15. J. A. Todd and H. S. M. Coxeter, A practical method for enumerating cosets of a finite
abstract group, Proc. Edinburgh Math. Soc. (2), 5 (1936), 26-36.


University of Birmingham











AN EXTREME DUODENARY FORM 
H. S. M. COXETER AND J. A. TODD


1. Introduction. Let $f(x_1, \ldots, x_n)$ be a positive definite quadratic form
of determinant $\Delta$; let $M$ be its minimum value for integers $x_1, \ldots, x_n$, not all
zero; and let $2s$ be the number of times this minimum is attained, i.e., the number
of solutions of the Diophantine equation


$f(x_1, \ldots, x_n) = M.$


The form is said to be extreme if, for all infinitesimal variations of the coefficients,
$\Delta/M^n$ is minimum. A form for which this minimum is as small as possible is
said to be absolutely extreme. For a list of the known extreme forms with $n < 12$
(including the absolutely extreme forms $A_2$, $A_3$, $D_4$, $D_5$, $E_6$, $E_7$, $E_8$), see Coxeter
[1, pp. 394, 439]. Continuing this list, we may say that there are three known
extreme forms for $n = 12$:


1.1   $A_{12} = x_1^2 - x_1x_2 + \ldots + x_9^2 - x_9x_{10} + x_{10}^2 - x_{10}x_{11} + x_{11}^2 - x_{11}x_{12} + x_{12}^2,$

1.2   $D_{12} = x_1^2 - x_1x_2 + \ldots + x_9^2 - x_9x_{10} + x_{10}^2 - x_{10}x_{11} + x_{11}^2 - x_{10}x_{12} + x_{12}^2,$
1.3 
2 2 2 ie 2 3. 2 
Di2 = X%1 — Mi Xe t+... +X — XoXi0 + X10 — XeXu $+ Xn — XuXw + He. 


Apart from a numerical factor, these are equivalent to the forms $U_{12}$, $V_{12}$,
$W_{12}$ of Korkine and Zolotareff [5, p. 367]. Since $2D_{12}^{2}$ has integral coefficients and
determinant 1, but cannot take the value 1 (for integers $x_1, \ldots, x_{12}$), it must
also be equivalent to the form $f_{12}$ of Chao Ko [4, p. 85]. When giving the number
of automorphs as $2^{10}\,12!$ instead of $2^{11}\,12!$ [1, p. 434], Ko was doubtless omitting
the automorphs of negative determinant.

In a letter to one of us, dated January 14, 1947, T. W. Chaundy announced 
the form 


1.4 Ju = (4x, + x2 + x3)° + (4x1 + x2 + x)" 
+ (3x1 + x3 + xX + X53 + Xs)" + ($41 + 41 + x8 + x9) 


. ll 
+ > (3x1 + x,+ Axi)” + 7. (x; + 4x12)* 
5 9 


+ (x10 + xu + Aye)” 


Received October 15, 1952.


as a possible candidate for the title "absolutely extreme." However, $\Delta/M^n$ is
still smaller for the new form


1.5   $K_{12} = \tfrac{1}{12}\Bigl\{\sum_{j=1}^{5}(6x_j + 3y_j + 2y_6)^2 + 3\Bigl(6x_6 + 2\sum_{j=1}^{5} y_j - y_6\Bigr)^2 + 3\sum_{j=1}^{5} y_j^2 + y_6^2\Bigr\},$


which we shall derive from a lattice in unitary 6-space, somewhat resembling
the lattice in unitary 3-space that represents the senary form $E_6$ [1, p. 421].
We shall prove that $K_{12}$ is extreme, but we still do not know whether it is
absolutely extreme.

The following table shows how these five forms compare with one another: 


s M 2!24 (2/M)"*4 | 
Ax 78 l 13 13 | 
Dis 132 { ' 

D?, 132 | 
ya 324 2 2" i | 
Ki 378 2 3° ioe 


2 
| 709 





The value of $s$ for $K_{12}$ indicates that equal "solid" spheres in Euclidean 12-space
can be packed in such a way that each touches 756 others.


2. Eutactic stars. In Euclidean n-space, a set of $2s$ vectors $\pm a_1, \ldots, \pm a_s$
is called a eutactic star if the sum of the squares of the orthogonal projections of
$a_1, \ldots, a_s$ on a line is the same in all directions [1, p. 401], i.e., if there is a
constant $\sigma$ such that

$\sum_{i=1}^{s} (a_i \cdot z)^2 = \sigma z^2$

for every vector $z$. In terms of rectangular Cartesian coordinates, we write

$a_i = (\mu_{1i}, \ldots, \mu_{ni}), \qquad z = (\zeta_1, \ldots, \zeta_n)$

and the condition becomes

$\sum_i \sum_j \sum_k \mu_{ji}\mu_{ki}\zeta_j\zeta_k = \sigma \sum_j \zeta_j^2,$

$\sum_{i=1}^{s} \mu_{ji}\mu_{ki} = \sigma\delta_{jk}$

[1, p. 397, with $\rho_i = 1/\sigma$ and subscripts used instead of superscripts].

Let us now extend this notion to a unitary n-space (in which the distance
between two points is the square root of the sum of the norms of the differences
of their coordinates $\zeta_j$). The above "orthogonality" relation suggests that it
would be appropriate to define a eutactic star

$\pm a_i = (\mu_{1i}, \ldots, \mu_{ni}),$

in such a space, by the "Hermitian" relation

2.1   $\sum_{i=1}^{s} \bar\mu_{ji}\mu_{ki} = \sigma\delta_{jk}.$

If we write $\mu_{ji} = \xi_{ji} + \eta_{ji}\,\mathrm{i}$, where the $\xi$ and $\eta$ are real, this becomes

$\sum (\xi_{ji} - \eta_{ji}\,\mathrm{i})(\xi_{ki} + \eta_{ki}\,\mathrm{i}) = \sigma\delta_{jk},$

and the real part yields

$\sum (\xi_{ji}\xi_{ki} + \eta_{ji}\eta_{ki}) = \sigma\delta_{jk}.$

Hence
2.2 When the real and imaginary parts of the coordinates in unitary n-space are
interpreted as coordinates in real Euclidean 2n-space, a eutactic star remains
eutactic.
3. The lattice L. As usual, we write $\omega = e^{2\pi i/3}$ and

3.1   $\lambda = 1 - \omega = -2\omega - \omega^2.$

In unitary space of six dimensions, let $L$ denote the set of points whose
coordinates $(\zeta_1, \ldots, \zeta_6)$ are integers of the Eisenstein field $R(\omega)$, mutually
congruent modulo $\lambda$ and adding up to a multiple of 3, so that

3.2   $\zeta_j \equiv \zeta_k \pmod{\lambda} \quad (j, k = 1, \ldots, 6), \qquad \sum \zeta_j \equiv 0 \pmod 3.$

The corresponding vectors

$(\zeta_1, \ldots, \zeta_6) = \sum \zeta_j p_j$

have the property that the difference of any two belongs to the set; hence $L$ is a
lattice.
Since the coordinates are Eisenstein integers, we may write

$\zeta_j = A_j\omega + B_j\omega^2,$

where $A_j$ and $B_j$ are rational integers. The conditions 3.2 are equivalent to

3.3   $A_j + B_j \equiv A_k + B_k \pmod 3, \qquad \sum A_j \equiv \sum B_j \equiv 0 \pmod 3.$





Analogy with $E_6$ [1, p. 421] suggests the following theorem:
3.4 The twelve complex vectors

$t_j = 3p_j \quad (j = 1, \ldots, 5), \qquad t_6 = -3\lambda p_6,$
$t_k = \lambda(p_{k-6} - p_6) \quad (k = 7, \ldots, 11),$
$t_{12} = p_1 + \ldots + p_6$

form a rational integral basis for the lattice $L$.

Proof. Consider the vector

$z = \sum_{1}^{6} x_j t_j + \sum_{1}^{6} y_j t_{j+6},$

where $x_j$ and $y_j$ are rational integers. In terms of the orthogonal unit vectors
$p_1, \ldots, p_6$,

$z = \sum_{1}^{5} (3x_j + \lambda y_j + y_6)p_j + \Bigl(-3\lambda x_6 - \lambda\sum_{1}^{5} y_j + y_6\Bigr)p_6.$

If this is the same as $\sum (A_j\omega + B_j\omega^2)p_j$, we have, by 3.1,

$A_j\omega + B_j\omega^2 = -(2\omega + \omega^2)y_j - (\omega + \omega^2)(3x_j + y_6) \quad (j = 1, \ldots, 5),$
$A_6\omega + B_6\omega^2 = (2\omega + \omega^2)(3x_6 + y_1 + \ldots + y_5) - (\omega + \omega^2)y_6,$

whence

$A_j = -2y_j - 3x_j - y_6,$
$B_j = -y_j - 3x_j - y_6 \quad (j = 1, \ldots, 5),$
$A_6 = 6x_6 + 2(y_1 + \ldots + y_5) - y_6,$
$B_6 = 3x_6 + y_1 + \ldots + y_5 - y_6,$

and

$x_j = \tfrac13(A_j - 2B_j - A_6 + 2B_6),$
$y_j = -A_j + B_j \quad (j = 1, \ldots, 5),$
$x_6 = \tfrac13\sum_{1}^{6} (A_j - B_j),$
$y_6 = A_6 - 2B_6.$

These are rational integers whenever $A_j$ and $B_j$ satisfy 3.3. Thus 3.4 is proved.


The corresponding quadratic form, being the norm of $z$, is

$\sum (3x_j + \lambda y_j + y_6)(3x_j + \bar\lambda y_j + y_6) + \bigl(3\lambda x_6 + \lambda\sum y_j - y_6\bigr)\bigl(3\bar\lambda x_6 + \bar\lambda\sum y_j - y_6\bigr),$

where the $\sum$ indicates summation from 1 to 5. Since $\lambda + \bar\lambda = \lambda\bar\lambda = 3$, this is
equal to

$9\sum x_j^2 + 9\sum x_jy_j + 6\sum x_jy_6 + 3\sum y_j^2 + 3\sum y_jy_6 + 5y_6^2 + 27x_6^2 + 18x_6\sum y_j - 9x_6y_6 + 3(\sum y_j)^2 - 3\sum y_jy_6 + y_6^2$

$= 9\sum x_j^2 + 9\sum x_jy_j + 6\sum x_jy_6 + 27x_6^2 + 18\sum y_jx_6 - 9x_6y_6 + 3\sum y_j^2 + 3(\sum y_j)^2 + 6y_6^2$

$= 9\sum (x_j + \tfrac12 y_j + \tfrac13 y_6)^2 + 27(x_6 + \tfrac13\sum y_j - \tfrac16 y_6)^2 + \tfrac34\sum y_j^2 + \tfrac14 y_6^2$

$= \tfrac14\bigl\{\sum (6x_j + 3y_j + 2y_6)^2 + 3(6x_6 + 2\sum y_j - y_6)^2 + 3\sum y_j^2 + y_6^2\bigr\}$

$= 3K_{12}.$

The form $K_{12}$ has integral coefficients, and represents 2 but not 1; therefore
$M = 2$. The determinant is

$\Delta = \frac{36^5 \cdot 108 \cdot 3^5}{12^{12}} = \bigl(\tfrac34\bigr)^6 = \frac{729}{4096}.$
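The reduction above can be checked mechanically. The following is a minimal sketch, not part of the original argument: it evaluates the norm of $z$ in exact Eisenstein-integer arithmetic (writing each coordinate as $a + b\omega$, whose norm is $a^2 - ab + b^2$) and compares it with $3K_{12}$ as given in 1.5. The function names (`norm_of_z`, `k12`, `k12_matrix`) are illustrative choices, and the determinant is taken to be that of the symmetric matrix of $K_{12}$ in the variables $x_1, \ldots, x_6, y_1, \ldots, y_6$.

```python
from fractions import Fraction
import random


def norm_eisenstein(a, b):
    """Norm of the Eisenstein integer a + b*omega, i.e. a^2 - a*b + b^2."""
    return a * a - a * b + b * b


def norm_of_z(x, y):
    """Norm of z = sum x_j t_j + sum y_j t_(j+6) for the basis 3.4."""
    total = 0
    for j in range(5):
        # 3x_j + lambda*y_j + y_6 = (3x_j + y_j + y_6) - y_j*omega
        total += norm_eisenstein(3 * x[j] + y[j] + y[5], -y[j])
    s = 3 * x[5] + sum(y[:5])
    # -lambda*s + y_6 = (y_6 - s) + s*omega
    total += norm_eisenstein(y[5] - s, s)
    return total


def k12(x, y):
    """K_12 as written in 1.5, evaluated exactly."""
    braces = sum((6 * x[j] + 3 * y[j] + 2 * y[5]) ** 2 for j in range(5))
    braces += 3 * (6 * x[5] + 2 * sum(y[:5]) - y[5]) ** 2
    braces += 3 * sum(v * v for v in y[:5]) + y[5] ** 2
    return Fraction(braces, 12)


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    det = Fraction(1)
    for c in range(len(m)):
        pivot = next((r for r in range(c, len(m)) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, len(m)):
            factor = m[r][c] / m[c][c]
            m[r] = [v - factor * w for v, w in zip(m[r], m[c])]
    return det


def k12_matrix():
    """Symmetric matrix of K_12, recovered from the form by polarization."""
    def value(v):
        return k12(v[:6], v[6:])
    unit = [[int(i == j) for i in range(12)] for j in range(12)]
    return [[(value([p + q for p, q in zip(unit[u], unit[w])])
              - value(unit[u]) - value(unit[w])) / 2
             for w in range(12)] for u in range(12)]


rng = random.Random(0)
samples = [([rng.randint(-9, 9) for _ in range(6)],
            [rng.randint(-9, 9) for _ in range(6)]) for _ in range(200)]
identity_holds = all(norm_of_z(x, y) == 3 * k12(x, y) for x, y in samples)
print(identity_holds, determinant(k12_matrix()))
```

A random test of this kind does not replace the algebra, but it does catch sign or coefficient slips in the intermediate lines.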
4. Proof that the new form is eutactic. We know [1, pp. 397, 401] that a
positive definite form is extreme if it is both eutactic and perfect (in the sense
of Voronoi [10, p. 100]), and that it is eutactic if its minimal vectors constitute
a eutactic star.
In the case of our form $3K_{12}$, where $M = 6$, the minimal vectors go from
the origin to the lattice points at distance $\sqrt{6}$. In the notation of Shephard
[7, p. 96], these 756 points are easily seen to consist of the 486 points

4.1   $\pm(\omega^{m_1}, \ldots, \omega^{m_6}), \qquad m_1 + \ldots + m_6 \equiv 0 \pmod 3$

and the 270 points

4.2   $({}_{3}\lambda, -{}_{3}\lambda, 0, 0, 0, 0)'$,

where the pre-subscript 3 indicates multiplication by any power of $\omega$, and the
prime indicates all possible permutations.

From the symmetrical nature of the coordinates, we see at once that these
minimal vectors

$\pm(\mu_{1i}, \ldots, \mu_{6i})$

satisfy the criterion 2.1 for a eutactic star in the unitary 6-space. By 2.2, the
same 756 vectors form a eutactic star when regarded as belonging to the real
12-space. Hence

4.3 The form $K_{12}$ is eutactic.
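The symmetry argument can also be confirmed numerically. The sketch below is an illustration under stated assumptions rather than the authors' method: it builds the 756 points 4.1 and 4.2 as floating-point complex vectors, then checks the Hermitian relation 2.1 and the corresponding real relation in 12-space. Because the sums run over all 756 vectors (both signs), the constants come out as 756 and 378. All function names are hypothetical.

```python
import cmath
import itertools

OMEGA = cmath.exp(2j * cmath.pi / 3)
LAM = 1 - OMEGA


def minimal_vectors():
    """The 486 points 4.1 followed by the 270 points 4.2."""
    vecs = []
    for m in itertools.product(range(3), repeat=6):
        if sum(m) % 3 == 0:
            v = tuple(OMEGA ** k for k in m)
            vecs.append(v)
            vecs.append(tuple(-c for c in v))
    for j, k in itertools.permutations(range(6), 2):
        for a, b in itertools.product(range(3), repeat=2):
            v = [0j] * 6
            v[j] = LAM * OMEGA ** a
            v[k] = -LAM * OMEGA ** b
            vecs.append(tuple(v))
    return vecs


def hermitian_sums(vecs):
    """Matrix of sums of conj(mu_j) * mu_k over all vectors, as in 2.1."""
    n = len(vecs[0])
    return [[sum(v[j].conjugate() * v[k] for v in vecs) for k in range(n)]
            for j in range(n)]


def real_sums(vecs):
    """The same sums after splitting each coordinate into real and imaginary parts."""
    reals = [[part for c in v for part in (c.real, c.imag)] for v in vecs]
    n = len(reals[0])
    return [[sum(u[j] * u[k] for u in reals) for k in range(n)]
            for j in range(n)]


def is_scalar_matrix(matrix, value, tol=1e-8):
    """True when the matrix equals value times the identity, within tol."""
    return all(abs(entry - (value if j == k else 0)) < tol
               for j, row in enumerate(matrix) for k, entry in enumerate(row))


VECS = minimal_vectors()
print(len(VECS),
      is_scalar_matrix(hermitian_sums(VECS), 756),
      is_scalar_matrix(real_sums(VECS), 378))
```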








5. Proof that the new form is extreme. We know [1, p. 400] that a positive
definite form is perfect if its minimal vectors (in the Euclidean space) do not
lie on a quadric cone (with the origin as vertex). In the case of $K_{12}$, since we
can show that the 135 lines joining the origin to the points 4.2 do not lie on
such a cone, there is no need to examine 4.1.

The six points $({}_{3}\lambda)$ and $(-{}_{3}\lambda)$ in the unitary 1-space (or complex line)
correspond to the vertices of two equilateral triangles in the Euclidean plane (or
Argand diagram). These are similar to the triangles

$(-1, 0, 1)\ (0, 1, -1)\ (1, -1, 0)$ and $(1, 0, -1)\ (0, -1, 1)\ (-1, 1, 0)$

in the plane $\zeta_1 + \zeta_2 + \zeta_3 = 0$ of a Euclidean 3-space. Hence the figure formed
by the 270 points 4.2 is similar to

5.1   $((-1, 0, 1)^{\circ}, (1, 0, -1)^{\circ}, (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0))'$

in the 12-dimensional subspace

5.2   $x_1 + x_2 + x_3 = x_4 + x_5 + x_6 = \ldots = x_{16} + x_{17} + x_{18} = 0$

of a Euclidean 18-space. (The "degree" sign in 5.1 indicates cyclic permutation
of the three numbers within the parentheses, and the prime indicates all possible
permutations of the six triads of coordinates.)

Any 11-dimensional quadric cone (with the origin as vertex) containing the
270 points 5.1 could be regarded as a section of a 17-dimensional cone, say

5.3   $\sum_{1}^{18}\sum_{1}^{18} b_{jk}\zeta_j\zeta_k = 0 \qquad (b_{jk} = b_{kj}).$

Direct substitution yields 135 equations for the $b_{jk}$, such as

$-c_{13} - c_{14} + c_{16} + c_{34} - c_{36} - c_{46} = 0,$

where

$c_{jk} = 2b_{jk} - b_{jj} - b_{kk}.$

Adding the three equations

$c_{13} + c_{14} - c_{16} - c_{34} + c_{36} + c_{46} = 0,$
$c_{13} + c_{15} - c_{14} - c_{35} + c_{34} + c_{45} = 0,$
$c_{13} + c_{16} - c_{15} - c_{36} + c_{35} + c_{56} = 0,$

we obtain

$3c_{13} + (c_{56} + c_{46} + c_{45}) = 0.$

Since 123 and 456 could just as well have been any other two of the six triads,
we deduce that all the $c_{jk}$ are equal, say $c_{jk} = c$. Any one of the equations now
yields $c = 0$, whence

$2b_{jk} = b_{jj} + b_{kk},$

and

$\sum\sum b_{jk}\zeta_j\zeta_k = \tfrac12\sum\sum (b_{jj} + b_{kk})\zeta_j\zeta_k = \sum b_{jj}\zeta_j \sum \zeta_k.$

It follows that every quadric cone 5.3 containing the 270 points consists of two
17-spaces, one of which is $\sum \zeta_j = 0$. Hence there is no such cone in the 12-space
5.2, and we deduce

5.4 The form $K_{12}$ is perfect.

Combining this result with 4.3, we have

5.5 The form $K_{12}$ is extreme.
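The statement that no quadric cone in the 12-space 5.2 contains the points 5.1 is equivalent to saying that the 135 lines impose 78 independent linear conditions on a quadratic form in 12 variables. The sketch below, which is an independent check and not the argument of the text, tests this by exact rank computation. It assumes coordinates on the subspace 5.2 obtained by dropping the third member of each triad (a linear change of coordinates, which does not affect the rank); the names `lines_5_1` and `rank` are illustrative.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement

T1 = [(-1, 0, 1), (1, -1, 0), (0, 1, -1)]   # cyclic permutations of (-1, 0, 1)
T2 = [(1, 0, -1), (-1, 1, 0), (0, -1, 1)]   # cyclic permutations of (1, 0, -1)


def lines_5_1():
    """One point of 5.1 on each of the 135 lines, in 12 reduced coordinates.

    Only the first two entries of each triad are kept, since the third
    is determined by the condition that each triad sums to zero.
    """
    pts = []
    for a, b in combinations(range(6), 2):
        for t in T1:
            for u in T2:
                p = [0] * 12
                p[2 * a], p[2 * a + 1] = t[0], t[1]
                p[2 * b], p[2 * b + 1] = u[0], u[1]
                pts.append(p)
    return pts


def quadratic_row(p):
    """Values of the 78 monomials p_i * p_j (i <= j) at the point p."""
    return [p[i] * p[j]
            for i, j in combinations_with_replacement(range(len(p)), 2)]


def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    for c in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            if rows[r][c] != 0:
                factor = rows[r][c] / rows[rk][c]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[rk])]
        rk += 1
    return rk


POINTS = lines_5_1()
CONDITIONS = rank([quadratic_row(p) for p in POINTS])
print(len(POINTS), CONDITIONS)
```

A rank of 78 means that the only quadratic form vanishing at all these points is the zero form, which is the content of 5.4.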


6. The reciprocal form. When the real and imaginary parts of the coordinates
$\zeta_j = \xi_j + \eta_j\,\mathrm{i}$ are interpreted as coordinates in real 2n-space, the scalar product

$\sum \xi_j\xi'_j + \sum \eta_j\eta'_j$

of two Euclidean vectors is the real part of

$\sum (\xi_j - \eta_j\,\mathrm{i})(\xi'_j + \eta'_j\,\mathrm{i}).$

Accordingly, if we define the scalar product of two complex vectors

$z = (\zeta_1, \ldots, \zeta_n), \qquad z' = (\zeta'_1, \ldots, \zeta'_n)$

to be

6.1   $[z \cdot z'] = \sum \bar\zeta_j\zeta'_j$

[11, p. 15 (24); cf. 12, p. 16], we see that a necessary and sufficient condition
for the corresponding Euclidean vectors to be orthogonal is that the real part
of this scalar product should vanish.
It is easily verified that the twelve basic vectors 3.4 satisfy 


Rlit,-t.] = 0 eS eee 12; k#¥j+6), 
Rlit,- tye) = — 3/3 (j=1,..., 6), 
Rlit,-tr.e) = 3/3 (j =7,..., 12). 
Let $L'$ denote the lattice generated by the twelve vectors
$t^1 = \mathrm{i}t_7, \ldots, t^6 = \mathrm{i}t_{12},$
$t^7 = -\mathrm{i}t_1, \ldots, t^{12} = -\mathrm{i}t_6.$

Rit? - te] = 3/3 4 (j,# =1,..., 12), 


the lattices in Euclidean 12-space corresponding to $L$ and $L'$ are reciprocal
[1, p. 399].

Hence the reciprocal (or "adjoint") form is a numerical multiple of the
norm of

$\sum_{1}^{6} x_j t^j + \sum_{1}^{6} y_j t^{j+6}.$

This is derived from $3K_{12}$ itself by changing

$x_j,\ y_j$

into

$-y_j,\ x_j.$

Hence
6.2 The form $K_{12}$ is equivalent to its own reciprocal.


In this respect, Ky. resembles C,, As, Da, A’, , (which is extreme when 


r > 3) and D?, (which is extreme when r > 4) [1, pp. 406, 423, 431, 434]. 
(Ai and D§ are absolutely extreme, both being equivalent to £;.) 


7. Remarks on the lattice. As we have observed before, the form $K_{12}$ does
not represent 1. The solutions of the Diophantine equation

$K_{12} = N \qquad (N = 2, 3, \ldots)$

or $3K_{12} = 3N$, are represented geometrically by the points of the lattice $L$
at distance $(3N)^{1/2}$ from the origin.

When $N = 2$, we find the 756 points 4.1, 4.2, which are the vertices of the
complex uniform polytope $(2\ 1;\ 3_1)^3$ of Shephard [7a, p. 380]. They lie by
sixes on 126 lines through the origin, and the hyperplane at infinity meets these
lines in 126 points whose homogeneous coordinates (in complex projective
5-space) are

$(\omega^{m_1}, \ldots, \omega^{m_6}) \qquad (m_1 + \ldots + m_6 \equiv 0 \pmod 3)$

and

$(1, -\omega^{m}, 0, 0, 0, 0)'$.

These 126 points are the centres of the homologies in Mitchell's primitive
collineation group [6; 3; 8; 9; see especially 2, p. 402]. The simplex of reference
is one of the so-called $\alpha$-hexahedra [2, p. 407].
When $N = 3$, we find the 4032 points

$({}_{6}3, 0, 0, 0, 0, 0)'$,   $\pm\lambda(\omega^{m_1}, \omega^{m_2}, \omega^{m_3}, 0, 0, 0)'$

and

${}_{6}(-2, \omega^{m_1}, \ldots, \omega^{m_5})' \qquad (m_1 + \ldots + m_5 \equiv 0 \pmod 3),$

which are the vertices of Shephard's polytope $(2_1\ 1;\ 3)^3$. The corresponding
configuration at infinity consists of 6 + 180 + 486 = 672 points, which are the
vertices of the 112 $\alpha$-hexahedra.
When $N = 4$, we find the 20412 points

$\lambda(\omega^{m_1}, \omega^{m_2}, -\omega^{m_3}, -\omega^{m_4}, 0, 0)'$,

$\pm(-2\omega^{m_1}, -2\omega^{m_2}, \omega^{m_3}, \ldots, \omega^{m_6})' \qquad (m_1 + \ldots + m_6 \equiv 0 \pmod 3),$

${}_{6}(2 - \omega^{m_0}, \omega^{m_1}, \ldots, \omega^{m_5})' \qquad (m_0 = 1 \text{ or } 2,\ m_1 + \ldots + m_5 \equiv m_0 \pmod 3).$

The configuration at infinity consists of 1215 + 1215 + 972 = 3402 points,
which are the vertices of the 567 $\beta$-hexahedra [2, p. 408].
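The three families of points can be generated and counted directly. The following sketch is only a consistency check on the coordinate descriptions, not an enumeration of the lattice: it builds each family in exact Eisenstein-integer arithmetic (a pair `(a, b)` standing for $a + b\omega$), confirms the counts 756, 4032 and 20412, and verifies that every point has norm $3N$ and satisfies 3.2. It assumes that the pre-subscript 6 denotes multiplication by any of the six units $\pm\omega^k$, by analogy with the pre-subscript 3; all helper names are hypothetical.

```python
from itertools import combinations, permutations, product


def mul(p, q):
    """Product of Eisenstein integers p = a + b*omega and q = c + d*omega."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c - b * d)


def neg(p):
    return (-p[0], -p[1])


POW = [(1, 0), (0, 1), (-1, -1)]        # 1, omega, omega^2
UNITS = POW + [neg(u) for u in POW]     # the six units (assumed meaning of pre-subscript 6)
LAM = (1, -1)                           # lambda = 1 - omega
ZERO = (0, 0)


def norm(v):
    return sum(a * a - a * b + b * b for a, b in v)


def in_lattice(v):
    """Conditions 3.2, using a + b*omega = a + b (mod lambda)."""
    return (len({(a + b) % 3 for a, b in v}) == 1
            and sum(a for a, _ in v) % 3 == 0
            and sum(b for _, b in v) % 3 == 0)


def scaled(vs, factors):
    return {tuple(mul(f, c) for c in v) for v in vs for f in factors}


def powers(m):
    return [POW[k] for k in m]


def insert(entries, j, special):
    """Six coordinates: `special` at position j, `entries` elsewhere."""
    rest = list(entries)
    return tuple(special if i == j else rest.pop(0) for i in range(6))


def shell(n):
    """Points listed in the text for K_12 = n (n = 2, 3 or 4)."""
    signs = [(1, 0), (-1, 0)]
    m6 = [m for m in product(range(3), repeat=6) if sum(m) % 3 == 0]
    m5 = list(product(range(3), repeat=5))
    pts = set()
    if n == 2:
        pts |= scaled((tuple(powers(m)) for m in m6), signs)
        for j, k in permutations(range(6), 2):
            for a, b in product(range(3), repeat=2):
                v = [ZERO] * 6
                v[j], v[k] = mul(LAM, POW[a]), neg(mul(LAM, POW[b]))
                pts.add(tuple(v))
    elif n == 3:
        pts |= scaled((insert([ZERO] * 5, j, (3, 0)) for j in range(6)), UNITS)
        for pos in combinations(range(6), 3):
            for m in product(range(3), repeat=3):
                v = [ZERO] * 6
                for p, k in zip(pos, m):
                    v[p] = mul(LAM, POW[k])
                pts |= scaled([tuple(v)], signs)
        pts |= scaled((insert(powers(m), j, (-2, 0))
                       for j in range(6) for m in m5 if sum(m) % 3 == 0), UNITS)
    elif n == 4:
        for pos in combinations(range(6), 4):
            for plus in combinations(pos, 2):
                for m in product(range(3), repeat=4):
                    v = [ZERO] * 6
                    for p, k in zip(pos, m):
                        e = mul(LAM, POW[k])
                        v[p] = e if p in plus else neg(e)
                    pts.add(tuple(v))
        for pair in combinations(range(6), 2):
            pts |= scaled((tuple(mul((-2, 0), c) if i in pair else c
                                 for i, c in enumerate(powers(m)))
                           for m in m6), signs)
        for m0 in (1, 2):
            special = (2 - POW[m0][0], -POW[m0][1])   # 2 - omega^m0
            pts |= scaled((insert(powers(m), j, special)
                           for j in range(6) for m in m5
                           if sum(m) % 3 == m0), UNITS)
    return pts


COUNTS = {n: len(shell(n)) for n in (2, 3, 4)}
print(COUNTS)
```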


REFERENCES


1. H. S. M. Coxeter, Extreme forms, Can. J. Math., 3 (1951), 391-441.

2. C. M. Hamill, On a finite group of order 6,531,840, Proc. London Math. Soc. (2), 52
(1951), 401-454.

3. E. M. Hartley, A sextic primal in five dimensions, Proc. Cambridge Phil. Soc., 46 (1950),
91-105.

4. Chao Ko, On the positive definite quadratic forms with determinant unity, Acta Arithmetica,
3 (1939), 79-85.

5. A. Korkine and G. Zolotareff, Sur les formes quadratiques, Math. Ann., 6 (1873), 366-389.

6. H. H. Mitchell, Determination of all primitive collineation groups in more than four variables
which contain homologies, Amer. J. Math., 36 (1914), 1-12.

7. G. C. Shephard, Regular complex polytopes, Proc. London Math. Soc. (3), 2 (1952), 82-97.

7a. G. C. Shephard, Unitary groups generated by reflections, Can. J. Math., 5 (1953), 364-383.

8. J. A. Todd, The invariants of a finite collineation group in five dimensions, Proc. Cambridge
Phil. Soc., 46 (1950), 73-90.

9. J. A. Todd, The characters of a collineation group in five dimensions, Proc. Royal Soc. London,
A, 200 (1950), 320-336.

10. G. Voronoi, Sur quelques propriétés des formes quadratiques positives parfaites, J. reine
angew. Math., 133 (1907), 97-178.

11. H. Weyl, Gruppentheorie und Quantenmechanik (Leipzig, 1928).

12. H. Weyl, The theory of groups and quantum mechanics (New York, 1931).


University of Toronto
University of Cambridge


NUMERICAL INTEGRATION OF FUNCTIONS 
OF SEVERAL VARIABLES 


G. W. TYLER 


1. Introduction. Methods of mechanical quadrature of functions of more 
than one variable apparently have received little systematic investigation and 
the few available results are widely scattered in the literature. In this paper a 
systematic approach to this problem is given and a number of formulae are 
derived which may prove to be useful. 

It seems worthwhile to distinguish between two types of situations in which 
numerical integration may be employed advantageously. When the function to 
be integrated is defined analytically, its value at any point may be calculated 
to any desired accuracy and, in such instances, one can use methods of great 
strength (in the sense of high polynomial accuracy) at the cost of computing 
accurately a comparatively small number of values of the function, at points 
which may often be awkwardly located. Gauss’s formula for integrating func- 
tions of one variable is an example of this sort. On the other hand, when the 
function is defined empirically and the values must be measured rather than 
calculated, the accurate location of points at which values are taken may become 
difficult and less meaningful and the observed values themselves may be subject 
to a substantial error of measurement. Circumstances of this sort call for a 
formula which is based on easily located points and which is as unresponsive to 
errors of measurement as can be arranged, even though its strength may fall 
somewhat below the greatest obtainable. Formulae of both kinds are developed 
in this paper. Since the approach employed here has been used in devising
integration formulae for single integrals [12], it may be helpful to outline briefly
the development of some of them.


2. Single integrals. With proper choice of origin and scale, a definite
integral over any finite range may be written in the form

(1)   $I = \int_{0}^{1} f(x)\,dx.$


Received April 30, 1952; in revised form June 23, 1952. 

This paper in part summarizes a more detailed unpublished study, The Experimental Evalua- 
tion of Definite Integrals, which was done largely while the writer was a member of the Scientific 
Staff of the U.S. Navy Electronics Laboratory, and accepted as a doctoral dissertation by 
Virginia Polytechnic Institute in 1949. Dr. D. B. DeLury offered helpful suggestions in the 
preparation of this paper, and the figures were drawn by Mr. C. J. VanVliet of the U.S. Navy 
Electronics Laboratory. The writer is presently an Operations Analyst at Headquarters, 
U.S. Air Force. 


It is assumed that $f(x)$ is, or may be replaced by, a polynomial of degree $n$,

(2)   $f(x) = \sum_{i=0}^{n} A_i x^i.$

With this assumption

(3)   $I = \sum_{i=0}^{n} \frac{A_i}{i + 1}.$

Let an approximation to $I$ be given by

(4)   $I_0 = \sum_{\alpha=0}^{m} R_\alpha y_\alpha \qquad (m \le n),$

where the R's are constants (weights) to be determined and the $y_\alpha = f(x_\alpha)$ are
calculated or observed values of the function.
The difference,

(5)   $E = I_0 - I,$

will be called the polynomial error.
Expanding the right side of (5) and equating to zero the coefficients of the
A's, we are led to the system of equations

(6)   $\sum_{\alpha=0}^{m} R_\alpha x_\alpha^{i} = \frac{1}{i + 1} \qquad (i = 0, 1, \ldots, n).$

When the x values are chosen equally spaced over the interval of integration,
including the end points, and $m = n$, the system of equations (6) leads directly
to the Newton-Cotes integration formula. The values of the $R_\alpha$ determined by
the equations are the Cotes numbers.
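As an illustration of how (6) determines the weights once the abscissae are fixed, the sketch below solves the linear system exactly for equally spaced points on the unit interval. It is a minimal example, not taken from the paper; `solve` and `cotes_weights` are hypothetical helper names, and the interval 0 to 1 is the one implied by the right-hand side $1/(i+1)$.

```python
from fractions import Fraction


def solve(a, b):
    """Solve the square system a*r = b exactly by Gauss-Jordan elimination."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if m[r][c] != 0)
        m[c], m[pivot] = m[pivot], m[c]
        m[c] = [v / m[c][c] for v in m[c]]
        for r in range(n):
            if r != c and m[r][c] != 0:
                factor = m[r][c]
                m[r] = [v - factor * w for v, w in zip(m[r], m[c])]
    return [row[-1] for row in m]


def cotes_weights(n):
    """Weights R_0..R_n from (6) with m = n and x_k = k/n on [0, 1]."""
    xs = [Fraction(k, n) for k in range(n + 1)]
    rows = [[x ** i for x in xs] for i in range(n + 1)]
    rhs = [Fraction(1, i + 1) for i in range(n + 1)]
    return solve(rows, rhs)


for n in (1, 2, 3):
    print(n, cotes_weights(n))
```

For $n = 2$ this gives the weights 1/6, 2/3, 1/6 of Simpson's rule on the unit interval.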

Clearly the arbitrary assignment of abscissa values always leads to a problem 
of the same type, whose solution depends on a system of linear equations. 

When the x values are not chosen arbitrarily but are selected to satisfy as
many as possible of the set of equations (6), and perhaps other conditions as
well, the equations are no longer linear and the solutions are more difficult to
obtain. Some slight simplification may be accomplished by noticing that if a
set of values $x_\alpha$ $(\alpha = 0, 1, \ldots, m)$ is a solution of (6), so also is the set $(1 - x_\alpha)$.
To take advantage of this symmetry, we can write

(7)   $I = \int_{-1}^{1} F(x)\,dx = 2\int_{0}^{1} Q(x)\,dx,$

where $Q(x)$ is the even part of $F(x)$. Placing

$I_0 = \sum_{\alpha=0}^{m} R_\alpha [F(x_\alpha) + F(-x_\alpha)]$

and proceeding as before, we obtain the set of equations, analogous to (6),

(8)   $\sum_{\alpha=0}^{m} R_\alpha x_\alpha^{2i} = \frac{1}{2i + 1} \qquad (i = 0, 1, \ldots, n).$

If we determine the $R_\alpha$ and $x_\alpha$ so that the first $2m + 2$ equations of (8) are
satisfied, we obtain the Gauss formula, which has the highest possible polynomial
accuracy. Some indication of the circumstances in which this formula
might be deemed inappropriate is provided by the opinion of Gauss, who wrote
that the x values should always be expressed in sixteen decimals to insure no
error in the first $2m$ terms of (7). It is doubtful, therefore, that a formula of
this kind would be useful with experimental data.
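A short exact check shows how the familiar Gauss formulae fit equations (8). In the sketch below each node is stored as a pair $(R_\alpha, x_\alpha^2)$, so no square roots are needed. The two-point formula ($R = 1$, $x^2 = 1/3$) fits the scheme directly with $m = 0$; for the three-point formula the centre is treated, as an assumption of this example, as a node $x = 0$ whose value is counted twice, so its weight 8/9 appears as $R = 4/9$. The function name `residuals` is illustrative.

```python
from fractions import Fraction


def residuals(nodes, count):
    """Left side minus right side of (8) for i = 0, ..., count - 1.

    `nodes` is a list of pairs (R, x squared).
    """
    return [sum(r * x2 ** i for r, x2 in nodes) - Fraction(1, 2 * i + 1)
            for i in range(count)]


two_point = [(Fraction(1), Fraction(1, 3))]
three_point = [(Fraction(4, 9), Fraction(0)), (Fraction(5, 9), Fraction(3, 5))]

print(residuals(two_point, 3))
print(residuals(three_point, 4))
```

The first failing equation marks the limit of polynomial accuracy: $i = 2$ (degree 4) for the two-point formula and $i = 3$ (degree 6) for the three-point formula.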

Errors in the experimental determination of F(x.) would usually be expected, 
even if the x. could be located without error. Often it is reasonable to assume 
that these errors are independent and have constant variance. It then follows 
easily that a formula with equal weights is the least responsive to these errors. 
Tchebichef's formulae were constructed to satisfy this condition and in
applications, as with the Gauss formulae, it is necessary to determine function values at
points that must be located with high accuracy.


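3. The integral over a rectangle of a function of two variables. The integral
to be evaluated will be written

(9)   $I = \int_{-a}^{a}\int_{-b}^{b} F(x, y)\,dx\,dy,$

and it will be assumed that

(10)   $F(x, y) = \sum A_{ij} x^i y^j, \qquad i + j \le 2n.$

Let

(11)   $I_0 = 4ab \sum_{\alpha=1}^{m} R_\alpha F(x_\alpha, y_\alpha).$

Substitution of the polynomial form of $F(x, y)$ in both $I$ and $I_0$ yields the
relation

(12)   $I_0 - I = 4ab\Bigl[A_{00}\Bigl(\sum_{\alpha=1}^{m} R_\alpha - 1\Bigr) + A_{10}\Bigl(\sum_{\alpha=1}^{m} R_\alpha x_\alpha\Bigr) + \ldots$
$\qquad + A_{2i,2j}\Bigl(\sum_{\alpha=1}^{m} R_\alpha x_\alpha^{2i} y_\alpha^{2j} - \frac{a^{2i} b^{2j}}{(2i + 1)(2j + 1)}\Bigr) + \ldots$
$\qquad + A_{2n,0}\Bigl(\sum_{\alpha=1}^{m} R_\alpha x_\alpha^{2n} - \frac{a^{2n}}{2n + 1}\Bigr)\Bigr].$

Equating to zero the coefficients of the $A_{ij}$, we are led to the system of equations

(13)   $\sum_{\alpha=1}^{m} R_\alpha x_\alpha^{i} y_\alpha^{j} = \frac{a^i b^j}{(i + 1)(j + 1)}$,   $i, j$ both even,
$\qquad\qquad = 0$,   otherwise.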


The problem of devising integration formulae thus becomes the problem of
finding solutions to sets of equations drawn from (13). For example, for $m = 4$,
a solution of the first three equations yields the obvious rule, analogous to the
trapezoidal rule:

(14)   $\int_{-a}^{a}\int_{-b}^{b} F(x, y)\,dx\,dy = ab[F(a, b) + F(a, -b) + F(-a, b) + F(-a, -b)].$

This rule is exact when $F(x, y)$ is linear. If $F(x, y)$ is a polynomial of degree
2 or 3, the error committed is easily calculated to be

(15)   $I_0 - I = E' = \tfrac{8}{3}ab(A_{20}a^2 + A_{02}b^2).$

Even if $F(x, y)$ is not of degree 3, the magnitude of the error introduced by
those terms of degree one higher than that for which a formula is exact may
occasionally be useful in selecting an appropriate formula. A value of $E'$,
defined in this manner, is therefore attached to each integration formula.

When $m = 5$, the first ten equations of (13) can be satisfied by the following
values:

$R_1 = \tfrac16, \quad R_2 = \tfrac16, \quad R_3 = \tfrac16, \quad R_4 = \tfrac16, \quad R_5 = \tfrac13,$
$x_1 = a, \quad x_2 = 0, \quad x_3 = -a, \quad x_4 = 0, \quad x_5 = 0,$
$y_1 = 0, \quad y_2 = b, \quad y_3 = 0, \quad y_4 = -b, \quad y_5 = 0.$

The resulting formula will be called the first five-point third degree accuracy
formula:

(16)   $\int_{-a}^{a}\int_{-b}^{b} F(x, y)\,dx\,dy = \tfrac23 ab[2F(0, 0) + F(a, 0) + F(-a, 0) + F(0, b) + F(0, -b)],$

$E' = \tfrac{4}{45}ab(6A_{40}a^4 - 5A_{22}a^2b^2 + 6A_{04}b^4).$

Another set of solutions to the same equations, obtained by taking points at
the centre and corners of the rectangle, yields the second five-point third degree
accuracy formula:

(17)   $\int_{-a}^{a}\int_{-b}^{b} F(x, y)\,dx\,dy = \tfrac13 ab[8F(0, 0) + F(a, b) + F(a, -b) + F(-a, b) + F(-a, -b)],$

$E' = \tfrac{8}{45}ab(3A_{40}a^4 + 5A_{22}a^2b^2 + 3A_{04}b^4).$
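The degree of accuracy and the error terms of formulae (14), (16) and (17) can be confirmed by applying each rule to monomials in exact rational arithmetic. The sketch below is one way of doing so, with illustrative function names; the half-sides $a = 2$, $b = 3$ are arbitrary test values.

```python
from fractions import Fraction


def exact_integral(i, j, a, b):
    """Integral of x^i y^j over -a <= x <= a, -b <= y <= b."""
    if i % 2 or j % 2:
        return Fraction(0)
    return 4 * a ** (i + 1) * b ** (j + 1) / ((i + 1) * (j + 1))


def corners(F, a, b):
    return F(a, b) + F(a, -b) + F(-a, b) + F(-a, -b)


def side_centres(F, a, b):
    return F(a, 0) + F(-a, 0) + F(0, b) + F(0, -b)


def rule_14(F, a, b):
    return a * b * corners(F, a, b)


def rule_16(F, a, b):
    return Fraction(2, 3) * a * b * (2 * F(0, 0) + side_centres(F, a, b))


def rule_17(F, a, b):
    return Fraction(1, 3) * a * b * (8 * F(0, 0) + corners(F, a, b))


def error(rule, i, j, a, b):
    """I_0 - I for the monomial x^i y^j."""
    return rule(lambda x, y: x ** i * y ** j, a, b) - exact_integral(i, j, a, b)


def degree_of_exactness(rule, a, b, limit=10):
    """Largest d such that every monomial of degree <= d is integrated exactly."""
    for d in range(limit + 1):
        if any(error(rule, i, d - i, a, b) != 0 for i in range(d + 1)):
            return d - 1
    return limit


A, B = Fraction(2), Fraction(3)
for rule in (rule_14, rule_16, rule_17):
    print(rule.__name__, degree_of_exactness(rule, A, B))
```

The error of a rule on $x^4$ or $x^2y^2$ reproduces the corresponding coefficient of $E'$, since $E'$ is linear in the $A_{ij}$.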








If an area of integration can be broken down into a number of rectangles of
equal dimensions, the first five-point formula can be applied to each rectangle
and the results added, to furnish a simple rule, similar in nature to Simpson's
rule. Each interior point will have a weight of 2, either because it is at the centre
of a rectangle or because it is on the boundaries of two rectangles, while each
point on the perimeter of the area will have unit weight. Thus, if there are
$p$ perimeter points and $q$ interior points the integral over the total area is
given by

(18)   $\iint F(x, y)\,dx\,dy = \frac{\text{total area of rectangles}}{p + 2q}\,\bigl[\sum(\text{function values at perimeter points}) + 2\sum(\text{function values at interior points})\bigr].$

This rule gives nearly equal weighting to the function values which may, in some
applications, be desirable.
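A minimal sketch of the composite rule follows, assuming a rectangular region divided into `nx` by `ny` equal cells; the function names and the grid layout are choices made for this example. It applies formula (16) cell by cell, and separately tallies the accumulated weight of each sample point to confirm that interior points receive weight 2, perimeter points weight 1, and that the weights total $p + 2q$.

```python
from fractions import Fraction


def five_point(F, cx, cy, a, b):
    """Formula (16) on the rectangle centred at (cx, cy) with half-sides a, b."""
    return Fraction(2, 3) * a * b * (
        2 * F(cx, cy) + F(cx + a, cy) + F(cx - a, cy)
        + F(cx, cy + b) + F(cx, cy - b))


def composite(F, x0, x1, y0, y1, nx, ny):
    """Sum of the five-point rule over an nx-by-ny grid of equal rectangles."""
    a = Fraction(x1 - x0) / (2 * nx)
    b = Fraction(y1 - y0) / (2 * ny)
    return sum(five_point(F, x0 + (2 * i + 1) * a, y0 + (2 * j + 1) * b, a, b)
               for i in range(nx) for j in range(ny))


def node_weights(nx, ny):
    """Accumulated weight of each sample point, keyed by half-step grid index."""
    weights = {}
    for i in range(nx):
        for j in range(ny):
            cx, cy = 2 * i + 1, 2 * j + 1
            for key, w in (((cx, cy), 2), ((cx + 1, cy), 1), ((cx - 1, cy), 1),
                           ((cx, cy + 1), 1), ((cx, cy - 1), 1)):
                weights[key] = weights.get(key, 0) + w
    return weights


def cubic(x, y):
    return x ** 3 + 2 * x * y + y ** 2 + 1


print(composite(cubic, 0, 2, 0, 3, 2, 3))
```

Since each cell is integrated exactly for polynomials of degree 3, the composite value for the test cubic over $0 \le x \le 2$, $0 \le y \le 3$ equals the true integral, 54.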

The second five-point formula, similarly used, leads to another formula of 
the same kind. The details need not be given here. 

A continuation of this approach to obtain formulae of higher polynomial
accuracy becomes tedious. Some simplicity may be gained by taking advantage
of symmetry. For a fifth degree function,

$I = 4ab(A_{00} + \tfrac13 A_{20}a^2 + \tfrac13 A_{02}b^2 + \tfrac15 A_{40}a^4 + \tfrac19 A_{22}a^2b^2 + \tfrac15 A_{04}b^4),$

which may be written as

(19)   $I = \tfrac{4}{45}ab(45M + 15N + 9P + 5Q),$

where

$M = A_{00}, \quad N = A_{20}a^2 + A_{02}b^2, \quad P = A_{40}a^4 + A_{04}b^4, \quad Q = A_{22}a^2b^2.$

We wish to find values of $F(x, y)$, at a number of symmetrically and
conveniently located points, which properly weighted and summed, will be equal
to the value of the integral. If we choose the centre of the rectangle, $(0, 0)$, the
centres of the four sides, $(0, \pm b)$ and $(\pm a, 0)$, and the four corners, $(\pm a, \pm b)$,
a direct calculation shows that $F(0, 0) = M$, the sum of the values at the
centres of the sides is $4M + 2N + 2P$ and the sum of the values at the four
corners is $4(M + N + P + Q)$. If these three sets, properly weighted, are to
furnish a value for the integral, we must have identically in $M$, $N$, $P$, $Q$,

(20)   $45M + 15N + 9P + 5Q = \alpha M + \beta(4M + 2N + 2P) + 4\gamma(M + N + P + Q).$

Hence

$\alpha + 4\beta + 4\gamma = 45,$
$2\beta + 4\gamma = 15,$
$2\beta + 4\gamma = 9,$
$4\gamma = 5.$

Since this set of equations is inconsistent, it is impossible to obtain fifth degree
accuracy using this set of nine points or, apparently, any other similarly selected
set of nine points.

If, in addition to the nine points considered above, we take the four points
midway from the centre to the midpoints of the sides, a solution is obtained.
The sum of the function values at these four points is $4M + \tfrac12 N + \tfrac18 P$ which,
given a weight $\delta$ and added to the right side of (20), produces a consistent set
of equations

$\alpha + 4\beta + 4\gamma + 4\delta = 45,$
$2\beta + 4\gamma + \tfrac12\delta = 15,$
$2\beta + 4\gamma + \tfrac18\delta = 9,$
$4\gamma = 5.$

The solutions are

$\alpha = -28, \quad \beta = 1, \quad \gamma = 5/4, \quad \delta = 16.$
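The four equations can be solved mechanically, and the resulting thirteen-point rule tested on monomials. The sketch below does both in exact arithmetic. It is an illustration only: `solve`, `rule_13` and `degree_of_exactness` are hypothetical names, and the rule is written in the form $\tfrac{4}{45}ab[\alpha F_0 + \beta\sum F_1 + \gamma\sum F_2 + \delta\sum F_3]$ that follows from (19) and (20).

```python
from fractions import Fraction


def solve(a, b):
    """Solve the square system a*r = b exactly by Gauss-Jordan elimination."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if m[r][c] != 0)
        m[c], m[pivot] = m[pivot], m[c]
        m[c] = [v / m[c][c] for v in m[c]]
        for r in range(n):
            if r != c and m[r][c] != 0:
                factor = m[r][c]
                m[r] = [v - factor * w for v, w in zip(m[r], m[c])]
    return [row[-1] for row in m]


# Rows: coefficients of M, N, P, Q; columns: alpha, beta, gamma, delta.
COEFFICIENTS = [
    [1, 4, 4, 4],
    [0, 2, 4, Fraction(1, 2)],
    [0, 2, 4, Fraction(1, 8)],
    [0, 0, 4, 0],
]
ALPHA, BETA, GAMMA, DELTA = solve(COEFFICIENTS, [45, 15, 9, 5])


def corners(F, a, b):
    return F(a, b) + F(a, -b) + F(-a, b) + F(-a, -b)


def side_centres(F, a, b):
    return F(a, 0) + F(-a, 0) + F(0, b) + F(0, -b)


def rule_13(F, a, b):
    """Thirteen-point rule built from the solved weights."""
    return Fraction(4, 45) * a * b * (
        ALPHA * F(0, 0) + BETA * side_centres(F, a, b)
        + GAMMA * corners(F, a, b) + DELTA * side_centres(F, a / 2, b / 2))


def exact_integral(i, j, a, b):
    if i % 2 or j % 2:
        return Fraction(0)
    return 4 * a ** (i + 1) * b ** (j + 1) / ((i + 1) * (j + 1))


def error(rule, i, j, a, b):
    return rule(lambda x, y: x ** i * y ** j, a, b) - exact_integral(i, j, a, b)


def degree_of_exactness(rule, a, b, limit=10):
    for d in range(limit + 1):
        if any(error(rule, i, d - i, a, b) != 0 for i in range(d + 1)):
            return d - 1
    return limit


A, B = Fraction(2), Fraction(3)
print(ALPHA, BETA, GAMMA, DELTA, degree_of_exactness(rule_13, A, B))
```

The errors on $x^6$ and $x^4y^2$ give the two coefficients of the sixth-degree error term of the rule.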


The positions of the points and the corresponding weights are shown in
Figure 1, following a scheme used by Bickley [2]. It is easily verified that the
solutions obtained here satisfy the first twenty-one equations of (13) with
$m = 13$.

FIGURE 1

Thirteen-Point Fifth Degree Accuracy Formula for Double Integrals

(21)   $\int_{-a}^{a}\int_{-b}^{b} F(x, y)\,dx\,dy = \tfrac{1}{45}ab\bigl[-112F_0 + 4\sum F_1 + 5\sum F_2 + 64\sum F_3\bigr]$

where

$F_0 = F(0, 0),$
$\sum F_1 = F(0, b) + F(a, 0) + F(0, -b) + F(-a, 0),$
$\sum F_2 = F(a, b) + F(a, -b) + F(-a, b) + F(-a, -b),$
$\sum F_3 = F(0, \tfrac12 b) + F(\tfrac12 a, 0) + F(0, -\tfrac12 b) + F(-\tfrac12 a, 0),$

and

$E' = \tfrac{2}{21}ab(A_{60}a^6 + A_{06}b^6) + \tfrac{8}{45}ab(A_{42}a^4b^2 + A_{24}a^2b^4).$


The approach used in reaching the thirteen-point formula, (21), may be
continued without alteration to establish the following twenty-one-point, seventh
degree accuracy formula.

FIGURE 2
Twenty-One-Point Seventh Degree Accuracy Formula for Double Integrals

(22)   $\int_{-a}^{a}\int_{-b}^{b} F(x, y)\,dx\,dy = \tfrac{1}{945}ab\bigl[5388F_0 + 111\sum F_1 + 49\sum F_2 + 405\sum F_3 + 896\sum F_4 - 1863\sum F_5\bigr]$

where

$F_0 = F(0, 0),$
$\sum F_1 = F(0, b) + F(a, 0) + F(0, -b) + F(-a, 0),$
$\sum F_2 = F(a, b) + F(a, -b) + F(-a, -b) + F(-a, b),$
$\sum F_3 = F(0, \tfrac23 b) + F(\tfrac23 a, 0) + F(0, -\tfrac23 b) + F(-\tfrac23 a, 0),$
$\sum F_4 = F(\tfrac12 a, \tfrac12 b) + F(\tfrac12 a, -\tfrac12 b) + F(-\tfrac12 a, -\tfrac12 b) + F(-\tfrac12 a, \tfrac12 b),$
$\sum F_5 = F(0, \tfrac13 b) + F(\tfrac13 a, 0) + F(0, -\tfrac13 b) + F(-\tfrac13 a, 0),$

$E' = \tfrac{166}{3645}ab(A_{80}a^8 + A_{08}b^8) + \tfrac{2}{63}ab(A_{62}a^6b^2 + A_{26}a^2b^6) + \tfrac{14}{225}abA_{44}a^4b^4.$
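Because the fractions locating the points of the twenty-one-point formula are easy to misread, an exact check is worthwhile. The sketch below assumes the assignment used in this example, namely that the weight 405 belongs to the axis points at two-thirds of the half-sides and the weight -1863 to those at one-third, with an overall factor 1/945. Under that assumption the rule integrates every monomial of degree 7 or less exactly. Function names are illustrative.

```python
from fractions import Fraction


def corners(F, a, b):
    return F(a, b) + F(a, -b) + F(-a, b) + F(-a, -b)


def axis_points(F, a, b):
    return F(a, 0) + F(-a, 0) + F(0, b) + F(0, -b)


def rule_21_point(F, a, b):
    """Twenty-one-point rule; the point fractions 2/3 and 1/3 are assumed."""
    return Fraction(1, 945) * a * b * (
        5388 * F(0, 0)
        + 111 * axis_points(F, a, b)
        + 49 * corners(F, a, b)
        + 405 * axis_points(F, 2 * a / 3, 2 * b / 3)
        + 896 * corners(F, a / 2, b / 2)
        - 1863 * axis_points(F, a / 3, b / 3))


def exact_integral(i, j, a, b):
    if i % 2 or j % 2:
        return Fraction(0)
    return 4 * a ** (i + 1) * b ** (j + 1) / ((i + 1) * (j + 1))


def error(rule, i, j, a, b):
    return rule(lambda x, y: x ** i * y ** j, a, b) - exact_integral(i, j, a, b)


def degree_of_exactness(rule, a, b, limit=10):
    for d in range(limit + 1):
        if any(error(rule, i, d - i, a, b) != 0 for i in range(d + 1)):
            return d - 1
    return limit


A, B = Fraction(2), Fraction(3)
print(degree_of_exactness(rule_21_point, A, B))
```

Swapping the two fractions breaks exactness already at degree 2, which is how the assignment was settled for this sketch.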


If sets of symmetrically located points are used, with all members of a set 
retaining the same weight, some further simplification can be had by introducing 
the function 


(x, y) = i[F(x, y) + F(x, —y) + F(—x, y) + F(—x, —y)] 


24 


Ao + Anx* — Any Yo + Ax 2jX° 


yy see 
Then 


a b va ed 
f f F(x, y) dx dy = 4 | j }(x, y) dx dy. 
—a —>b JO 90 


The adoption of $(x, y) leads to the system of equations 


m 2i,27 
22) a — | 6 -_ 
(23) 2, Ra xe ve * Git+ DQ@F1)’ 
for all z, 7 for which i + 7 < 2n. For m = 1, 


2 


2 1 2 1 
R,=1, x = 76¢, Fi = 3b. 


Thus we have the following four-point, third degree accuracy formula: 


(24) f i F(x, y) dx dy = ab| F(a/V/3, b/-V/3) + Fla/V3, —b/V3) 


+ F(—a/¥V3, b/V/3) + F(—a/V3, —b/V3)] 
and 
E’ = is ab(A 40 a‘ o- Au b*). 


This formula, which has the same polynomial accuracy as the five-point formulae 
developed earlier, has the merit that all function values are weighted equally. 
A formula constructed from this one by adding over a set of elemental rec- 
tangles would also have this property. 
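The four-point formula is short enough to state as code. This is a minimal sketch in floating point; `four_point_rule` is an illustrative name, and `F` is assumed to be any function of two real arguments.

```python
import math

def four_point_rule(F, a, b):
    """Equal-weight four-point estimate of the integral of F over |x| <= a, |y| <= b."""
    x = a / math.sqrt(3.0)
    y = b / math.sqrt(3.0)
    # All four function values carry the same weight ab.
    return a * b * (F(x, y) + F(x, -y) + F(-x, y) + F(-x, -y))
```

For the cubic F(x, y) = 1 + x^3 + 3x^2 + xy^2 + 2y^2 with a = 2, b = 1, direct integration gives 8 + 32 + 16/3 = 136/3, and the rule should reproduce this value up to rounding.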


For m = 2 there is no solution to the first six equations of (23). It is therefore 
impossible to obtain fifth degree accuracy using two sets of four points sym- 
metrically disposed in this manner. 


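If we put $m = 3$, $x_3 = 0$, $y_2 = 0$, we obtain the equations:

$$\begin{aligned}
R_1 + R_2 + R_3 &= 1,\\
R_1x_1^2 + R_2x_2^2 &= \tfrac{1}{3}a^2,\\
R_1y_1^2 + R_3y_3^2 &= \tfrac{1}{3}b^2,\\
R_1x_1^4 + R_2x_2^4 &= \tfrac{1}{5}a^4,\\
R_1x_1^2y_1^2 &= \tfrac{1}{9}a^2b^2,\\
R_1y_1^4 + R_3y_3^4 &= \tfrac{1}{5}b^4.
\end{aligned} \tag{25}$$

The following values constitute a solution to this set:

$$R_1 = \tfrac{9}{49}, \qquad R_2 = R_3 = \tfrac{20}{49},$$
$$x_1^2 = \tfrac{7}{9}a^2, \qquad x_2^2 = \tfrac{7}{15}a^2,$$
$$y_1^2 = \tfrac{7}{9}b^2, \qquad y_3^2 = \tfrac{7}{15}b^2.$$

This leads to the eight-point formula (26) for fifth degree accuracy. The points
and weights are shown in the following table:

$$\begin{array}{c|cccccccc}
\alpha & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8\\
\hline
196R_\alpha & 9 & 9 & 9 & 9 & 40 & 40 & 40 & 40\\
x_\alpha & \sqrt{7/9}\,a & \sqrt{7/9}\,a & -\sqrt{7/9}\,a & -\sqrt{7/9}\,a & \sqrt{7/15}\,a & -\sqrt{7/15}\,a & 0 & 0\\
y_\alpha & \sqrt{7/9}\,b & -\sqrt{7/9}\,b & \sqrt{7/9}\,b & -\sqrt{7/9}\,b & 0 & 0 & \sqrt{7/15}\,b & -\sqrt{7/15}\,b
\end{array}$$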


Points 5 and 6 may be interpreted as four points which have become coincident
in pairs, and hence the function at each of these points would have double the
weight indicated by $R_2$ in the solution of (25). This interpretation also holds
for points 7 and 8.


For (26), which is displayed with Figure 3,

$$E' = \tfrac{16}{14175}\,ab\bigl[-53(A_{60}a^6 + A_{06}b^6) + 70(A_{42}a^4b^2 + A_{24}a^2b^4)\bigr].$$
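The solution of (25) and the remainder coefficients can be confirmed in exact arithmetic, since only the squares of the coordinates enter. The sketch below works in units a = b = 1; the variable names are illustrative, and the remainder is taken as formula minus integral, which is the sign convention assumed here.

```python
from fractions import Fraction as Fr

# Assumed reading of the solution of (25), in units a = b = 1.
R1, R2, R3 = Fr(9, 49), Fr(20, 49), Fr(20, 49)
x1sq, x2sq = Fr(7, 9), Fr(7, 15)
y1sq, y3sq = Fr(7, 9), Fr(7, 15)

def residuals():
    """Left side minus right side for each of the six equations of (25)."""
    return [
        R1 + R2 + R3 - 1,
        R1 * x1sq + R2 * x2sq - Fr(1, 3),
        R1 * y1sq + R3 * y3sq - Fr(1, 3),
        R1 * x1sq**2 + R2 * x2sq**2 - Fr(1, 5),
        R1 * x1sq * y1sq - Fr(1, 9),
        R1 * y1sq**2 + R3 * y3sq**2 - Fr(1, 5),
    ]

def sixth_degree_errors():
    """(formula - integral) for x^6 and for x^4 y^2 over the square |x|, |y| <= 1."""
    e60 = 4 * (R1 * x1sq**3 + R2 * x2sq**3) - Fr(4, 7)
    e42 = 4 * R1 * x1sq**2 * y1sq - Fr(4, 15)
    return e60, e42
```

The two values returned by `sixth_degree_errors()` correspond to the multipliers of -53 and 70 in the remainder term.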


The above formula, written for integrating over a square of side 2 units was 
given by Burnside in 1908 [3]. He gave no details of its derivation but stated 
that it was constructed by a procedure closely similar to that which gives 
Gauss’s two-point third degree accuracy and three-point fifth degree accuracy 
formulae for single integrals. Burnside illustrated the use of his formula by 
approximating the value of the two integrals: 


» CSati=n © LLostty 













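FIGURE 3

Eight-Point Fifth Degree Accuracy Formula for Double Integrals

[Diagram: the eight points of the rectangle, each marked with its weight.]

$$\int_{-a}^{a}\int_{-b}^{b} F(x, y)\,dx\,dy
= \tfrac{1}{49}\,ab\Bigl[9\sum F\bigl(\pm\sqrt{\tfrac{7}{9}}\,a,\ \pm\sqrt{\tfrac{7}{9}}\,b\bigr) + 40\sum F\bigl(\pm\sqrt{\tfrac{7}{15}}\,a,\ 0\bigr) + 40\sum F\bigl(0,\ \pm\sqrt{\tfrac{7}{15}}\,b\bigr)\Bigr] \tag{26}$$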


where it is understood the summations extend over all distinct combinations 
of signs. 


The exact values of the integrals (i) and (ii) are 

(i) $\tfrac{1}{2}\pi(1 - 1/\sqrt{3})$, (ii) $\pi(1 - 1/\sqrt{2})$,
which, reduced to 4-figure decimals, are 

(i) 0.6639, (ii) 0.9202. 


Burnside gives the values of these integrals, as calculated from the formula 
(26) as: 


(i) 0.6641, (ii) 0.9262. 


He points out that in the second integral the conditions are unfavorable for 
applying the approximation formula since both first partial derivatives of the 
radical in this integral increase without limit as the point x = 1, y = 1, is 
approached. 

The integrals (i) and (ii) were used also by Aitken and Frewin [1] to obtain 
a rough numerical check on some of the formulae for double integrals which 
they developed. 

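If we return to (23) and put $m = 4$, the first ten equations yield the solutions:

$$\frac{x_1}{a} = \frac{y_1}{b} = \sqrt{\frac{114 - 3\sqrt{583}}{287}} = 0.380555; \qquad R_1 = \frac{178981 + 2769\sqrt{583}}{472230} = 0.520593;$$

$$\frac{x_2}{a} = \frac{y_2}{b} = \sqrt{\frac{114 + 3\sqrt{583}}{287}} = 0.805980; \qquad R_2 = \frac{178981 - 2769\sqrt{583}}{472230} = 0.237432;$$

$$\frac{x_3}{a} = \frac{y_4}{b} = \sqrt{\frac{6}{7}} = 0.925820; \qquad R_3 = R_4 = \frac{49}{405} = 0.120988.$$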


This leads to the twelve-point seventh degree accuracy formula (27). 


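FIGURE 4

Twelve-Point Seventh Degree Accuracy Formula for Double Integrals

[Diagram: the twelve points of the rectangle, labelled with the weights $R_1$, $R_2$, $2R_3$, $2R_4$.]

$$\int_{-a}^{a}\int_{-b}^{b} F(x, y)\,dx\,dy = ab\Bigl[R_1\sum F(\pm x_1, \pm y_1) + R_2\sum F(\pm x_2, \pm y_2) + 2R_3\sum F(\pm x_3, 0) + 2R_4\sum F(0, \pm y_4)\Bigr]. \tag{27}$$

The remainder error is:

$$E' = ab\bigl[-0.013184(A_{80}a^8 + A_{08}b^8) + 0.020441(A_{62}a^6b^2 + A_{26}a^2b^6) - 0.010035\,A_{44}a^4b^4\bigr].$$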


For the values of the integrals which Burnside used as a rough check for his
formula, the above twelve-point formula gives (i) 0.6639 and (ii) 0.9161. The
approximations are seen to be better than the approximations for these integrals
from Burnside's formula, though the approximation for (ii) is still in error by
4 units in the third significant figure.
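The ten equations behind the twelve-point formula can be checked numerically. The sketch below works in units a = b = 1 and uses the closed forms for the coordinates and weights as they are read here (with sqrt(583) entering both); the names `moment` and `eighth_degree_error` are illustrative.

```python
import math

S = math.sqrt(583.0)
u1 = (114 - 3 * S) / 287     # (x1/a)^2 = (y1/b)^2
u2 = (114 + 3 * S) / 287     # (x2/a)^2 = (y2/b)^2
u3 = 6 / 7                   # (x3/a)^2 = (y4/b)^2
R1 = (178981 + 2769 * S) / 472230
R2 = (178981 - 2769 * S) / 472230
R3 = R4 = 49 / 405

def moment(i, j):
    """Left side of (23) for exponents 2i, 2j, with y3 = 0 and x4 = 0."""
    total = R1 * u1 ** (i + j) + R2 * u2 ** (i + j)
    if j == 0:               # the points (+-x3, 0) contribute only when no y occurs
        total += R3 * u3 ** i
    if i == 0:               # the points (0, +-y4) contribute only when no x occurs
        total += R4 * u3 ** j
    return total

def eighth_degree_error():
    """(formula - integral) for x^8 over the square |x|, |y| <= 1."""
    return 4 * (R1 * u1 ** 4 + R2 * u2 ** 4 + R3 * u3 ** 4) - 4 / 9
```

Each `moment(i, j)` with i + j not exceeding 3 should agree with 1/((2i + 1)(2j + 1)) to rounding error, and `eighth_degree_error()` should reproduce the first coefficient of the remainder.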










4. Relative merits of the eight- and thirteen-point formulae and the twelve- 
and twenty-one-point formulae. If F(x, y) is of the fifth degree, there are 
twenty-one coefficients ($A_{i,j}$). This means that there are twenty-one disposable
constants, which can be used, except in special cases which we shall not discuss
here, to make F(x, y) pass through twenty-one points of, or satisfy a variety 
of other conditions with respect to, an experimentally obtained function. In 
statistical terms, this function has twenty-one degrees of freedom. It is evident 
then that the eight-point or the thirteen-point formula, with its respective 
number of measurements, in so far as the integration is concerned will dispose 
of these twenty-one degrees of freedom without error. In the problem of estima- 
ting the value of the double integral of a function taken over a single rectangle, 
the eight-point formula is 13/8 as efficient as the thirteen-point formula in 
controlling the polynomial error. If, however, we consider applying these 
formulae to a large number of equal-sized elemental rectangles, we see that this 
advantage of the eight-point formula is decreased, though apparently for all 
shapes of areas it will exist, at least to a small extent. The advantage of the 
eight-point formula decreases, of course, because the points located on the 
perimeter of the elemental rectangles may be coincident for two, three, or four 
of these rectangles. A situation favourable to the thirteen-point formula in
this respect occurs in the problem of estimating the integral over a rectangle
which, to increase the accuracy, has been subdivided into $n^2$ smaller rectangles
similar to the original. The eight-point formula would require $8n^2$ function
evaluations, compared with $8n^2 + 4n + 1$ evaluations for the thirteen-point
formula. It follows that for $n = 5$ an increase of about 10 per cent in the number
of function value determinations would be required to apply the thirteen-point
formula, but for $n > 50$, the corresponding increase would be less than 1 per
cent.
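The evaluation counts quoted above can be reproduced by listing the distinct points directly. This is a minimal sketch; it assumes the thirteen points of each elemental rectangle are its centre, its four corners, the four side midpoints, and the four points midway between the centre and the sides, and it measures coordinates in quarter-cell units so that every point is an integer pair.

```python
def thirteen_point_nodes(n):
    """Distinct points used by the thirteen-point formula on an n-by-n subdivision."""
    offsets = (
        [(2, 2)]                                   # centre of the cell
        + [(0, 2), (4, 2), (2, 0), (2, 4)]         # midpoints of the sides
        + [(0, 0), (0, 4), (4, 0), (4, 4)]         # corners
        + [(1, 2), (3, 2), (2, 1), (2, 3)]         # midway from centre to sides
    )
    return {(4 * i + dx, 4 * j + dy)
            for i in range(n) for j in range(n) for dx, dy in offsets}

def eight_point_count(n):
    """All eight points are interior to a cell, so none are shared."""
    return 8 * n * n

def extra_percent(n):
    """Percentage increase in evaluations for the thirteen-point formula."""
    extra = len(thirteen_point_nodes(n)) - eight_point_count(n)
    return 100.0 * extra / eight_point_count(n)
```

With n = 5 the set has 221 members against 200 for the eight-point formula, an increase of 10.5 per cent.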

In addition to the matter discussed in the last paragraph, it is evident that 
application of the eight-point formula would result in weights ($R_\alpha$) for each
point which would be more nearly equal than the weights that would result 
from applying the thirteen-point formula. On the other hand, the location of 
the thirteen points could be described more simply and perhaps in some prob- 
lems actually located with less error than will be the case with the eight points. 

A very similar situation to that just discussed exists in regard to using the 
twelve- or twenty-one-point formulae. The twelve-point formula which disposes 
of the effects of thirty-six coefficients is highly efficient in controlling the poly- 
nomial error when applied to a single rectangle. One can readily envisage 
conditions under which it would seem advisable in the same problem to use a 
combination of different-sized rectangles and formulae of different degree 
accuracy. 


5. Triple integrals over rectangular regions. Formulae for triple integrals 
can be developed by a natural extension of the methods used in the previous 
section. Let 








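$$I = \int_{-a_3}^{a_3}\int_{-a_2}^{a_2}\int_{-a_1}^{a_1} F(x_1, x_2, x_3)\,dx_1\,dx_2\,dx_3. \tag{28}$$

Continuing as we did for double integrals we obtain the system of equations

$$\sum_{\alpha=1}^{m} R_\alpha\,x_{1\alpha}^{\,i}\,x_{2\alpha}^{\,j}\,x_{3\alpha}^{\,k} =
\begin{cases}
\dfrac{a_1^{\,i}\,a_2^{\,j}\,a_3^{\,k}}{(i + 1)(j + 1)(k + 1)}, & \text{for } i, j, k \text{ all even},\\[2ex]
0, & \text{for } i, \text{ or } j, \text{ or } k \text{ odd}.
\end{cases} \tag{29}$$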

Grouping the points in sets of 8, one in each octant, we obtain a simpler system
than (29). One or both of these systems can perhaps be employed advantageously
in deriving formulae for higher degree accuracy, or formulae for other
special purposes.

If we assume $F(x_1, x_2, x_3)$ is a third degree polynomial and integrate (28)
directly we obtain:

$$I = 2^3a_1a_2a_3\bigl[A_{000} + \tfrac{1}{3}(A_{200}a_1^2 + A_{020}a_2^2 + A_{002}a_3^2)\bigr]. \tag{30}$$

By considering the values of $F(x_1, x_2, x_3)$ at the centre of each of the faces of
the parallelepiped as shown in Figure 5, we find that the volume of the integration
space multiplied by the mean of these six values is identical with (30).
Hence we have the following six-point formula for third degree accuracy:


FIGURE 5

Six-Point Third Degree Accuracy Formula for Triple Integrals

[Diagram: the parallelepiped with the six face centres marked.]

$$\int_{-a_3}^{a_3}\int_{-a_2}^{a_2}\int_{-a_1}^{a_1} F(x_1, x_2, x_3)\,dx_1\,dx_2\,dx_3
= \tfrac{4}{3}a_1a_2a_3\Bigl[\sum F(\pm a_1, 0, 0) + \sum F(0, \pm a_2, 0) + \sum F(0, 0, \pm a_3)\Bigr], \tag{31}$$

$$E' = \tfrac{8}{45}a_1a_2a_3\bigl[6(A_{400}a_1^4 + A_{040}a_2^4 + A_{004}a_3^4) - 5(A_{220}a_1^2a_2^2 + A_{202}a_1^2a_3^2 + A_{022}a_2^2a_3^2)\bigr].$$










In a similar manner, by considering the value of the function at the corners 
and centre of the parallelepiped, we obtain the following five-point formula for 
near third degree accuracy: 


$$\int_{-a_3}^{a_3}\int_{-a_2}^{a_2}\int_{-a_1}^{a_1} F(x_1, x_2, x_3)\,dx_1\,dx_2\,dx_3 = \tfrac{2}{3}a_1a_2a_3\bigl[8F(0, 0, 0)
+ F(a_1, a_2, a_3) + F(-a_1, a_2, -a_3) + F(a_1, -a_2, -a_3) + F(-a_1, -a_2, a_3)\bigr]. \tag{32}$$

Using appropriate $R$'s and coordinates as indicated by (31), it is found that
the first twenty equations of (29) for $m = 6$ are satisfied. Using the $R$'s and
coordinates as shown by (32), we find that nineteen of the first twenty equations
of (29) for $m = 5$ are satisfied. The only term less than fourth degree which
contributes an error when using (32) is the $x_1x_2x_3$ term. If the coefficient
$A_{111}$ of this term is available, then subtracting $\tfrac{8}{3}A_{111}a_1^2a_2^2a_3^2$ from (32) will
eliminate this error and enable us to make a full third degree precision estimate
from these five points.
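The statements about the six-point and five-point formulae can be tested monomial by monomial. The sketch below works on the cube with half-sides a1 = a2 = a3 = 1 in exact arithmetic; the names and the weights 4/3, 16/3 and 2/3 are the values assumed here in reading (31) and (32).

```python
from fractions import Fraction as Fr
from itertools import product

def exact(i, j, k):
    """Integral of x1^i x2^j x3^k over the cube |x1|, |x2|, |x3| <= 1."""
    if i % 2 or j % 2 or k % 2:
        return Fr(0)
    return Fr(8, (i + 1) * (j + 1) * (k + 1))

def apply(nodes, i, j, k):
    """Value a rule (list of x, y, z, weight) assigns to the monomial."""
    return sum(w * Fr(x) ** i * Fr(y) ** j * Fr(z) ** k for x, y, z, w in nodes)

# Six face centres, each with weight (4/3) a1 a2 a3.
SIX_POINT = [(s * e[0], s * e[1], s * e[2], Fr(4, 3))
             for e in ((1, 0, 0), (0, 1, 0), (0, 0, 1)) for s in (1, -1)]

# Centre with weight 8 * (2/3) and four alternate vertices with weight 2/3.
FIVE_POINT = [(0, 0, 0, Fr(16, 3))] + [
    (x, y, z, Fr(2, 3))
    for x, y, z in ((1, 1, 1), (-1, 1, -1), (1, -1, -1), (-1, -1, 1))]

def failures(nodes, max_degree=3):
    """Exponent triples of total degree <= max_degree that the rule gets wrong."""
    return [(i, j, k) for i, j, k in product(range(max_degree + 1), repeat=3)
            if i + j + k <= max_degree and apply(nodes, i, j, k) != exact(i, j, k)]
```

`failures(FIVE_POINT)` should list only the exponent triple of the x1 x2 x3 term, and the value the rule assigns to that term is the correction to be subtracted.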

We can obtain a nine-point third degree accuracy formula by considering 
the centre and all eight vertices of the parallelepiped as shown in Figure 6. 
This formula is written as (33) and while it does not control the polynomial 
error as efficiently as either of the two preceding formulae, it gives a different 
coverage of the integration space and in certain problems it can be employed 
advantageously. 


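FIGURE 6

Nine-Point Third Degree Accuracy Formula for Triple Integrals

[Diagram: the parallelepiped with its centre and eight vertices marked.]

$$\int_{-a_3}^{a_3}\int_{-a_2}^{a_2}\int_{-a_1}^{a_1} F(x_1, x_2, x_3)\,dx_1\,dx_2\,dx_3
= \tfrac{1}{3}a_1a_2a_3\Bigl[16F(0, 0, 0) + \sum F(\pm a_1, \pm a_2, \pm a_3)\Bigr], \tag{33}$$

$$E' = \tfrac{16}{45}a_1a_2a_3\bigl[3(A_{400}a_1^4 + A_{040}a_2^4 + A_{004}a_3^4) + 5(A_{220}a_1^2a_2^2 + A_{202}a_1^2a_3^2 + A_{022}a_2^2a_3^2)\bigr].$$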










If we seek greater accuracy and consider the twenty-one points which as 
shown in Figure 7 are located at 

(i) the centre of the parallelepiped, 

(ii) the six midpoints of the segments joining the centre of the parallelepiped 
to the centre of each face, 

(iii) the six centres of the faces, 

(iv) the eight vertices, 
we obtain formula (34) which has fifth degree accuracy. The details of the 
derivation will be omitted but proceeding as we did in developing the thirteen- 
point formula (21) we can derive (34) as the result of solving only four linear 
equations. 


FIGURE 7

Twenty-One-Point Fifth Degree Accuracy Formula for Triple Integrals

[Diagram: the parallelepiped with the twenty-one points marked with their weights.]

$$\int_{-a_3}^{a_3}\int_{-a_2}^{a_2}\int_{-a_1}^{a_1} F(x_1, x_2, x_3)\,dx_1\,dx_2\,dx_3
= \tfrac{1}{45}a_1a_2a_3\Bigl[-496F_1 + 128\sum F_2 + 8\sum F_3 + 5\sum F_4\Bigr], \tag{34}$$

where

$F_1 = F(0, 0, 0)$,
$\sum F_2$ = sum of values of the function at the 6 points located midway
from the centre of the parallelepiped to the six faces,
$\sum F_3$ = sum of values of the function at the 6 centres of the faces,
$\sum F_4$ = sum of values of the function at the 8 vertices.


A feature which limits the usefulness of this formula in applications where
the measurement error is heavy is the large negative weight of $F_1$. This can be
improved somewhat by adjusting the position of the six points represented by
$\sum F_2$, but the negative weighting cannot be eliminated in this way and it is
doubtful if a more useful formula will result from such an adjustment. The 
general ternary quintic has 56 terms each of which might contribute an error in 
estimating the value of the triple integral and thus formula (34), which utilizes 
only twenty-one points, has high efficiency for controlling the polynomial 
error. 
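The fifth degree accuracy of the twenty-one-point formula for triple integrals can be confirmed in the same exact way. This sketch takes the half-sides as 1 and assumes the factor 1/45 in front of the bracket, which is the value that makes the weights sum to the volume 8; the names are illustrative.

```python
from fractions import Fraction as Fr
from itertools import product

def rule34_nodes():
    """Nodes (point, weight) of the twenty-one-point formula on the cube |x_i| <= 1."""
    nodes = [((Fr(0), Fr(0), Fr(0)), Fr(-496, 45))]          # centre, F_1
    for axis in range(3):
        for s in (1, -1):
            # midway points (F_2) and face centres (F_3) on each axis
            for frac, w in ((Fr(1, 2), Fr(128, 45)), (Fr(1), Fr(8, 45))):
                p = [Fr(0)] * 3
                p[axis] = s * frac
                nodes.append((tuple(p), w))
    for p in product((Fr(1), Fr(-1)), repeat=3):              # vertices, F_4
        nodes.append((p, Fr(5, 45)))
    return nodes

def exact(i, j, k):
    """Integral of x1^i x2^j x3^k over the cube."""
    if i % 2 or j % 2 or k % 2:
        return Fr(0)
    return Fr(8, (i + 1) * (j + 1) * (k + 1))

def apply(i, j, k):
    """Value the formula assigns to the monomial."""
    return sum(w * p[0] ** i * p[1] ** j * p[2] ** k for p, w in rule34_nodes())
```

Every monomial of total degree five or less should be integrated exactly, while a sixth degree term such as x1^6 is not.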

It is clear that rules can be developed, based on any one of the last four for- 
mulae, for estimating the triple integral of a function over a domain which has 
been subdivided into elemental parallelepipeds. In view of the equal weighting 
for the points and the general simplicity of (31), it appears that such a rule 
based on this formula would possess the greatest practical merits. 

Sadowsky [15] developed the following 42-point formula: 


$$\int_{-1}^{1}\int_{-1}^{1}\int_{-1}^{1} u(x, y, z)\,dx\,dy\,dz = \tfrac{4}{225}\Bigl[91\sum u_6 - 40\sum u_{12} + 16\sum u_{24}\Bigr],$$

where $\sum u_6$ denotes the sum of the six values of $u(x, y, z)$ determined at the
centres of the six faces of the cube,

$\sum u_{12}$ denotes the sum of the values of $u(x, y, z)$ at midpoints of the
twelve edges of the cube,

$\sum u_{24}$ denotes the sum of the twenty-four values of $u(x, y, z)$ at the four
points on the diagonals of each face and at a distance of $\tfrac{1}{2}\sqrt{5}$ from the centre
of the face.

This formula has fifth degree accuracy and the points are all located on the 
surface of the cube. Sadowsky concludes that 42 is the smallest number of 
points that can be used to achieve this accuracy under the restraint that the 
points must lie on the surface. He also points out that the sixth degree function 
$F(x, y, z) = (x^2 - 1)(y^2 - 1)(z^2 - 1)$ vanishes at all points on the surface of
the cube and hence it is impossible in general to attain as high as sixth degree 
accuracy under the above restraint. 
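Both remarks about the surface formula can be illustrated numerically. The sketch below builds the 42 points on the cube of side 2 centred at the origin; the factor 4/225 and the weight 91 for the face centres are the values assumed here, being the ones for which the formula integrates constants and squares exactly. The function names are illustrative.

```python
import math
from itertools import product

def sadowsky_nodes():
    """The 42 surface points (point, weight) on the cube |x|, |y|, |z| <= 1."""
    c = 4.0 / 225.0
    s = math.sqrt(5.0 / 8.0)   # offset along each face axis; distance sqrt(5)/2 from the face centre
    nodes = []
    for axis in range(3):
        others = [k for k in range(3) if k != axis]
        for sign in (1.0, -1.0):
            p = [0.0, 0.0, 0.0]
            p[axis] = sign
            nodes.append((tuple(p), 91 * c))                 # face centre
            for s1, s2 in product((s, -s), repeat=2):        # four points on the face diagonals
                q = list(p)
                q[others[0]], q[others[1]] = s1, s2
                nodes.append((tuple(q), 16 * c))
        for s1, s2 in product((1.0, -1.0), repeat=2):        # four edge midpoints per axis
            q = [0.0, 0.0, 0.0]
            q[others[0]], q[others[1]] = s1, s2
            nodes.append((tuple(q), -40 * c))
    return nodes

def rule(f):
    """Estimate of the triple integral of f over the cube."""
    return sum(w * f(*p) for p, w in sadowsky_nodes())

def exact_monomial(i, j, k):
    """Integral of x^i y^j z^k over the cube."""
    if i % 2 or j % 2 or k % 2:
        return 0.0
    return 8.0 / ((i + 1) * (j + 1) * (k + 1))
```

Applied to the sixth degree function (x^2 - 1)(y^2 - 1)(z^2 - 1), `rule` returns zero because every point lies on the surface, whereas the integral itself is -64/27.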


6. Generalization for first degree and third degree accuracy. The possibi- 
lities of writing formulae with a given degree accuracy for any number of vari- 
ables have not been explored extensively, but it is evident that some of the 
formulae of the preceding sections are special cases of more general formulae
that can be written. Let us consider:

$$I = \int_{-a_n}^{a_n}\cdots\int_{-a_1}^{a_1} F(x_1, \ldots, x_n)\,dx_1\cdots dx_n \tag{35}$$

where $F(x_1, \ldots, x_n)$ can be expressed in series form by

$$F(x_1, \ldots, x_n) = \sum\cdots\sum A_{\alpha_1\ldots\alpha_n}\,x_1^{\alpha_1}\cdots x_n^{\alpha_n} \tag{36}$$

for all $\alpha_i$ for which

$$\sum_{i=1}^{n}\alpha_i \le N.$$


If $F(x_1, \ldots, x_n)$ is linear, all the coefficients, except the first (the constant),
will be neutralized in the successive integrations and we have immediately
the following formula for first degree accuracy:

$$I = 2^n\prod_{i=1}^{n} a_i\,F(0, 0, \ldots, 0). \tag{37}$$


In terms of n-dimensional geometry, the almost trivial result (37) simply 
asserts that the integral of any linear function taken over a rectangular domain 
is the product of the “‘volume”’ of the integration domain and the value of the 
function at the centre of this domain. 

If we assume $F(x_1, \ldots, x_n)$ is a third degree polynomial, then by direct
integration of (35) we obtain

$$I = 2^n\prod_{i=1}^{n} a_i\,\bigl[A_{00\ldots0} + \tfrac{1}{3}(A_{20\ldots0}a_1^2 + A_{02\ldots0}a_2^2 + \ldots + A_{00\ldots2}a_n^2)\bigr]. \tag{38}$$


It is evident that the expression in brackets in (38) is a weighted average
of the value of the function at the centre of the integration space and at the
"centres of the faces" of this space. Equation (39) gives the weighting which
for all positive values of $n$ yields (38) and is therefore a $2n + 1$ point formula
with third degree accuracy for integrating over a rectangular $n$-space.

$$I = \frac{2^n}{6}\prod_{i=1}^{n} a_i\,\bigl[(6 - 2n)F(0, 0, \ldots, 0) + F(a_1, 0, \ldots, 0)
+ F(-a_1, 0, \ldots, 0) + F(0, a_2, \ldots, 0) + \ldots + F(0, 0, \ldots, -a_n)\bigr]. \tag{39}$$
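Formula (39) translates directly into a routine for any number of variables. This is a minimal sketch; `rule39` is an illustrative name, the factor 2^n/6 is assumed as the reading of (39), and `F` is assumed to accept a list of coordinates.

```python
from fractions import Fraction as Fr

def rule39(F, a):
    """(2n + 1)-point estimate of the integral of F over |x_i| <= a[i]."""
    n = len(a)
    volume = Fr(2) ** n
    for ai in a:
        volume *= ai
    centre = [Fr(0)] * n
    total = (6 - 2 * n) * F(centre)       # weight of the centre point
    for i, ai in enumerate(a):
        for s in (1, -1):                 # the two "face centres" on axis i
            p = list(centre)
            p[i] = s * ai
            total += F(p)
    return volume * total / 6
```

For n = 3 the centre weight vanishes and the routine reduces to the six-point formula (31). For a cubic in four variables the result should agree with direct integration.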


7. Orthogonal polynomial methods in evaluating multiple integrals. The 
methods of orthogonal polynomials are useful in estimating both the observa- 
tional error and the integral of functions of two or more variables. The poly- 
nomial 


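$$Z = f(x, y) = \sum_{\alpha=0}^{N}\sum_{\beta=0}^{N} b_{\alpha\beta}\,x^{\alpha}y^{\beta}, \qquad \alpha + \beta \le N,$$

can be rearranged and written as

$$Z = \sum_{\alpha=0}^{N}\sum_{\beta=0}^{N} B_{\alpha\beta}\,\xi'_{\alpha}(x)\,\xi'_{\beta}(y), \qquad \alpha + \beta \le N, \tag{40}$$

where $\xi'_{\alpha}(x)$ and $\xi'_{\beta}(y)$ are orthogonal polynomials of degree $\alpha$ and $\beta$ in $x$ and $y$
respectively. In problems where the statistical error is relatively great, a realistic
and effective approach is provided by fitting (40) as a regression surface to the
experimental or computed values, $z_{ij}$. If (40) is fitted by least squares to a set
of values, $z_{ij}$, the coefficient $B_{\alpha\beta}$ is given by

$$B_{\alpha\beta} = \frac{\sum_i\sum_j z_{ij}\,\xi'_{\alpha}(x_i)\,\xi'_{\beta}(y_j)}{\sum_i\bigl[\xi'_{\alpha}(x_i)\bigr]^2\,\sum_j\bigl[\xi'_{\beta}(y_j)\bigr]^2}. \tag{41}$$

The reduction in residual sum of squares attributable to $B_{\alpha\beta}$ is the product of
$B_{\alpha\beta}$ and the numerator of this quantity as given by (41).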
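The calculation in (41) can be sketched for a small grid. The example below assumes five equally spaced values of both x and y and uses the familiar integer values of the orthogonal polynomials of degree 0, 1 and 2 on five points; the table `XI` and the function names are illustrative, not DeLury's tabulation.

```python
from fractions import Fraction

# Orthogonal polynomial values on five equally spaced points (degrees 0, 1, 2).
XI = {
    0: [1, 1, 1, 1, 1],
    1: [-2, -1, 0, 1, 2],
    2: [2, -1, -2, -1, 2],
}

def numerator(z, alpha, beta):
    """Numerator of (41); z[i][j] is the observed value at (x_i, y_j)."""
    xa, xb = XI[alpha], XI[beta]
    return sum(z[i][j] * xa[i] * xb[j] for i in range(5) for j in range(5))

def coefficient(z, alpha, beta):
    """Least squares coefficient B_{alpha beta} from (41)."""
    den = sum(v * v for v in XI[alpha]) * sum(v * v for v in XI[beta])
    return Fraction(numerator(z, alpha, beta), den)

def reduction(z, alpha, beta):
    """Reduction in the residual sum of squares attributable to B_{alpha beta}."""
    return coefficient(z, alpha, beta) * numerator(z, alpha, beta)
```

Because the polynomials are orthogonal over the grid, each coefficient is obtained independently of the others, so a surface built from known coefficients is recovered term by term.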

The arithmetic necessary for these calculations can be greatly reduced by 
using tabulated values of the orthogonal polynomials provided the observations 
are made at equally spaced x and y values. Moreover, the value of the double 
integral over any rectangle can be estimated by easy calculations using tabulated 
values for integrals of the orthogonal polynomials. The network of equally 
spaced observation points can either include the boundary of the integration 
rectangle or correspond to points at the centres of elemental rectangles within
the integration rectangle. 

The details of all these calculations, along with an example, are given by
DeLury [4].


8. Double integrals over curvilinear bounded areas. In problems requiring 
integration over an irregularly bounded area, one can see possibilities of obtain- 
ing a more accurate and efficient approximation by the use of formulae which 
involve variable limits for the integrals. Though it is evident that the com- 
plexities increase rapidly as we allow the bounding surface and cylinder greater 
freedom, the following two formulae can be developed quite simply. 

Let 


$$I = \int_{-a}^{a}\int_{0}^{b(1 - x^2/a^2)} F(x, y)\,dy\,dx. \tag{42}$$

Geometrically, $I$ represents the volume under the surface $F(x, y)$ and bounded
by the parabolic cylinder $y = b(1 - x^2/a^2)$ and the $xy$ and $xF(x, y)$ planes.
If we select the five points shown in Figure 8 and proceed in a manner closely 
analogous to the procedure for developing the thirteen-point rectangle formula 
(21), we find that we can achieve second degree accuracy for F(x, y) in terms 
of these points. 

Seeking greater freedom for F(x, y), we can gain simplicity by considering 
the doubly symmetrical integral: 


$$\int_{-a}^{a}\int_{-b(1 - x^2/a^2)}^{b(1 - x^2/a^2)} F(x, y)\,dy\,dx. \tag{44}$$


Taking advantage of the symmetrical location of the points we can group the 
thirteen points shown in Figure 9 into six groups and derive, as the result of 








solving a set of six (we now see it could have been done with five) linear equations
for the weights, the thirteen-point parabolic formula (45), which has fifth
degree accuracy.

FIGURE 8

Five-Point Second Degree Accuracy Formula for Double Integrals over
Parabolic Regions

[Diagram: the parabolic region with the five points marked with their weights.]

$$\int_{-a}^{a}\int_{0}^{b(1 - x^2/a^2)} F(x, y)\,dy\,dx = \tfrac{2}{105}\,ab\bigl[4F(0, 0) + 4F(0, b)
+ 7F(-a, 0) + 7F(a, 0) + 48F(0, \tfrac{1}{2}b)\bigr], \tag{43}$$


EF’ = -—- Wn ab (6A 21a°b + 17A o9b’). 


REFERENCES 


1. A. C. Aitken and G. L. Frewin, The numerical evaluation of double integrals, Proc. Edinburgh Math. Soc., 42 (1923).
2. W. G. Bickley, Finite difference formulae for the square lattice, Quarterly J. Mech. and Applied Math., 7 (1948).
3. W. Burnside, An approximate quadrature formula, Messenger of Math., 37 (1908).
4. D. B. DeLury, Values and integrals of the orthogonal polynomials up to n = 26 (Toronto, 1950).
5. Carl Frederick Gauss, Werke, vol. 3 (Göttingen, 1876).
6. J. O. Irwin, On quadrature and cubature, Tracts for Computers, No. 10 (Cambridge, 1923).
7. W. Woolsey Johnson, On Cotesian numbers; their history, computation and values to n = 20, Quarterly J. Math., 46 (1946).
8. LaGrangian interpolation coefficients, Mathematics Tables Project (New York, 1944).
9. J. Clerk Maxwell, On approximate multiple integration, Proc. Phil. Soc., 3 (1880).
10. Leroy R. Meyers and Arthur Sard, Best approximate integration formulas, J. Math. Phys., 29 (1950).
11. W. E. Milne, Numerical calculus (Princeton, 1949).
12. B. P. Moors, Valeur approximative d'une intégrale définie (Paris, 1905).








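FIGURE 9

Thirteen-Point Fifth Degree Accuracy Formula for Double Integrals over
Parabolic Regions

[Diagram: the doubly symmetrical parabolic region with the thirteen points marked with their weights.]

$$\int_{-a}^{a}\int_{-b(1 - x^2/a^2)}^{b(1 - x^2/a^2)} F(x, y)\,dy\,dx = \tfrac{4}{10395}\,ab\Bigl[344F(0, 0) + 248\sum F(0, \pm b)
+ 768\sum F(0, \pm\tfrac{1}{2}b) + 165\sum F(\pm a, 0) + 704\sum F(\pm\tfrac{1}{2}a, 0) + 704\sum F(\pm\tfrac{1}{2}a, \pm\tfrac{1}{2}b)\Bigr], \tag{45}$$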


E’ = — jab [(%} Awa’ + as (Aaah? + An a’b*) + sit Ave b'). 


Attractive features of formula (45) are the simple position of the points and 
the near equality of weighting for all these points. 
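The degree of accuracy of the two parabolic formulae can be checked against exact moments of the region. The sketch below takes a = b = 1 and works in exact arithmetic; the factors 2/105 and 4/10395 and the central weight 344 are assumptions of this sketch, chosen as the values for which the formulae integrate constants and x^2 exactly. All names are illustrative.

```python
from fractions import Fraction as Fr
from math import comb

def parabola_moment(i, m):
    """Integral of x^i (1 - x^2)^m over -1 <= x <= 1."""
    if i % 2:
        return Fr(0)
    return sum(Fr((-1) ** k * comb(m, k) * 2, i + 2 * k + 1) for k in range(m + 1))

def exact_half(i, j):
    """Integral of x^i y^j over 0 <= y <= 1 - x^2, the region of (42)."""
    return parabola_moment(i, j + 1) / (j + 1)

def exact_full(i, j):
    """Integral of x^i y^j over |y| <= 1 - x^2, the region of (44)."""
    return Fr(0) if j % 2 else 2 * exact_half(i, j)

H = Fr(1, 2)
RULE43 = [(0, 0, 4), (0, 1, 4), (-1, 0, 7), (1, 0, 7), (0, H, 48)]
C43 = Fr(2, 105)
RULE45 = ([(0, 0, 344)]
          + [(0, s, 248) for s in (1, -1)]
          + [(0, s * H, 768) for s in (1, -1)]
          + [(s, 0, 165) for s in (1, -1)]
          + [(s * H, 0, 704) for s in (1, -1)]
          + [(sx * H, sy * H, 704) for sx in (1, -1) for sy in (1, -1)])
C45 = Fr(4, 10395)

def apply(rule, c, i, j):
    """Value a rule (list of x, y, weight) with factor c assigns to x^i y^j."""
    return c * sum(w * Fr(x) ** i * Fr(y) ** j for x, y, w in rule)

def max_exact_degree(rule, c, exact, limit=8):
    """Largest d such that every monomial of degree <= d is integrated exactly."""
    for d in range(limit + 1):
        if any(apply(rule, c, i, d - i) != exact(i, d - i) for i in range(d + 1)):
            return d - 1
    return limit
```

Under these assumptions the five-point formula is exact through the second degree and the thirteen-point formula through the fifth.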


13. B. P. Moors, Etude sur les formules (spécialement de Gauss) servant à calculer des valeurs approximatives d'une intégrale définie, Verh. Akad. Wet. Amsterdam, 11.6 (1913).

14. A. L. O'Toole, On the degree of approximation of certain quadrature formulas, Ann. Math. Stat., 4 (1933).

15. Michael Sadowsky, A formula for approximate computation of a triple integral, Amer. Math. Monthly, 47 (1940).

16. Gabor Szegö, Orthogonal polynomials (New York, Amer. Math. Soc. Colloquium Publication, vol. 23, 1939).

17. M. P. Tchebichef, Sur les quadratures, J. Math. pures appl., 19 (1874).

18. E. T. Whittaker and G. Robinson, The calculus of observations (London, 1937).

19. E. T. Whittaker and G. N. Watson, Modern analysis (Cambridge, 1946).


United States Air Force 
Washington 25, D.C. 





THE NON-EXISTENCE OF CERTAIN AFFINE 
RESOLVABLE BALANCED INCOMPLETE BLOCK 
DESIGNS 


S. S. SHRIKHANDE


1. Summary. A method of proving the impossibility of certain Affine 
Resolvable Balanced Incomplete Block Designs (A.R.B.I.B.D.) has been 
given by the author elsewhere [9]. More complete results in the same direction 
are obtained here using the ideas of a paper by Connor [4]. 


2. Preliminary results. A Balanced Incomplete Block Design (B.I.B.D.)
with parameters $v$, $b$, $r$, $k$, and $\lambda$ is said to be affine resolvable if the $b$ blocks can
be separated into $r$ sets, each forming a complete replication, such that any
two blocks of different sets have the same number of treatments in common.
It has been shown [1] that the parameters of such a design can be expressed in
terms of two integers $n$ and $t$ ($n \ge 2$, $t \ge 0$) in the following manner:

$$v = nk = n^2[(n - 1)t + 1], \qquad b = nr = n(n^2t + n + 1), \qquad \lambda = nt + 1. \tag{2.00}$$

Further any two blocks of the same set have no treatment in common, whereas
those from different sets have exactly

$$\frac{k^2}{v} = (n - 1)t + 1$$

treatments in common.
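The parametrization 2.00 can be tabulated and checked against the usual counting identities of a balanced incomplete block design. This is a minimal sketch; `arbibd_parameters` is an illustrative name.

```python
def arbibd_parameters(n, t):
    """Return (v, b, r, k, lam) for the affine resolvable design with parameters 2.00."""
    x = (n - 1) * t + 1          # treatments common to two blocks of different sets
    v = n * n * x
    k = n * x
    r = n * n * t + n + 1
    b = n * r
    lam = n * t + 1
    return v, b, r, k, lam
```

For n = 2, t = 1 this gives v = 8, b = 14, r = 7, k = 4, lam = 3, and in every case b k = v r, lam (v - 1) = r (k - 1), and k^2 / v = (n - 1) t + 1.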

Let $A$ be a symmetric matrix of order $m$ with elements in the rational field.
Then $A$ is said to be rationally equivalent to $B$, $A \sim B$, if and only if there
exists a non-singular matrix $P$ with elements in the same field such that
$B = P'AP$, where $P'$ is the transpose of $P$. The equivalence of matrices satisfies
the requirements of an "equals" relationship.

Consider the Hasse invariant

$$c_p(A) = (-1, -D_m)_p\prod_{j=1}^{m-1}(D_j, -D_{j+1})_p, \tag{2.01}$$

where $p$ is a prime, $D_j$ is the leading principal minor determinant of order $j$ in
$A$ and $(a, b)_p$ is Pall's [6] generalization of the Hilbert norm residue symbol.
Let $i$ = index of $A$, and $d$ = the square-free part of $|A|$. Then we have

THEOREM A. Let $A$ and $B$ be two non-singular matrices of order $m$ with elements
in the rational field. Then $A \sim B$, if and only if $A$ and $B$ have the same values for
the invariants $i$, $d$, and $c_p$ for every prime $p$.


Received August 17, 1952 




The following useful properties of the Hilbert norm residue symbol are quoted 
from [3] for the sake of completeness. They have been used in the following 
section. 


THEOREM B. If $m$ and $m'$ are integers not divisible by an odd prime $p$, then
$$(m, m')_p = 1,$$
$$(m, p)_p = (p, m)_p = (m/p),$$
where $(m/p)$ is the Legendre symbol. Moreover, if $m \equiv m' \not\equiv 0 \pmod{p}$, then
$$(m, p)_p = (m', p)_p.$$

THEOREM C. For arbitrary non-zero integers $m$, $m'$, $n$, $n'$ and for every prime $p$,
$$(-m, m)_p = 1,$$
$$(m, n)_p = (n, m)_p,$$
$$(mm', n)_p = (m, n)_p(m', n)_p.$$
Further for $p$ an odd prime and every positive integer $m$,
$$(m, m + 1)_p = (-1, m + 1)_p.$$


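The results in the remaining part of this section are due to Connor [4]. Let
$N$ be the incidence matrix of $v$ rows and $b$ columns, i.e., the element $n_{ju}$ in
row $j$ and column $u$ is 1 or 0 according as treatment $j$ does or does not occur in
block $u$. Let the matrix $N$ be augmented to the matrix $N_1$ of $v + l$ rows and $b$
columns where

$$N_1 = \begin{pmatrix} N \\ I_l \;\; 0 \end{pmatrix} \tag{2.02}$$

and $I_l$ is the identity matrix of order $l$ and 0 is a matrix with all elements zero.
Then

$$N_1N_1' = \begin{pmatrix} NN' & N_l \\ N_l' & I_l \end{pmatrix} \tag{2.03}$$

where $N_l$ is the submatrix of the first $l$ columns of $N$. Obviously

$$NN' = \begin{pmatrix} r & \lambda & \cdots & \lambda \\ \lambda & r & \cdots & \lambda \\ \vdots & \vdots & & \vdots \\ \lambda & \lambda & \cdots & r \end{pmatrix} \tag{2.04}$$

and hence

$$|NN'| = kr(r - \lambda)^{v-1}. \tag{2.05}$$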





From 2.03 it is easy to show that

$$|N_1N_1'| = k\,r^{-l+1}(r - \lambda)^{v-l-1}\,|C_l|, \tag{2.06}$$

where $C_l = (c_{ju})$,

$$c_{jj} = (r - k)(r - \lambda) \tag{2.07}$$

and

$$c_{ju} = \lambda k - r s_{ju}, \tag{2.08}$$

for $j \ne u = 1, 2, \ldots, l$; $s_{ju}$ is the number of treatments common to blocks $j$
and $u$.

Let $N_2$ be the square matrix of side $b$:

$$N_2 = \begin{pmatrix} N \\ I_{b-v} \;\; 0 \end{pmatrix}. \tag{2.09}$$

Then

$$N_2N_2' = \begin{pmatrix} NN' & N_{b-v} \\ N_{b-v}' & I_{b-v} \end{pmatrix}, \tag{2.10}$$

where $N_{b-v}$ is the submatrix of the first $b - v$ columns of $N$. The principal minor
determinants of $N_2N_2'$ of order up to $v$ are the same as those of $NN'$ and those of
higher orders can be calculated from 2.06.

Let $P$ be the matrix

$$P = \begin{pmatrix} NN' & 0 \\ 0 & C_{b-v}E_{b-v} \end{pmatrix}, \tag{2.11}$$

where

$$E_{b-v} = [r(r - \lambda)]^{-1}I_{b-v}. \tag{2.12}$$

Then it is easily verified that the corresponding principal minor determinants
of $N_2N_2'$ and $P$ are equal. Hence

$$c_p(P) = c_p(N_2N_2').$$

But we know from [5] that

$$c_p(P) = c_p(NN')\,c_p(C_{b-v}E_{b-v})\,(|NN'|, |C_{b-v}E_{b-v}|)_p$$

for every odd prime $p$. Hence we have for any odd prime $p$,

$$c_p(N_2N_2') = c_p(NN')\,c_p(C_{b-v}E_{b-v})\,(|NN'|, |C_{b-v}E_{b-v}|)_p. \tag{2.13}$$


The value of c,(NN’) can be calculated as in [3] and is given by 










2.14 ¢,(NN’) = (— 1, rk),(— 1,7 — AA“ — A, rk) (o, rk)p(v, r — X)>- 


3. Impossibility of some A.R.B.I.B. designs. 


THEOREM 1. An A.R.B.I.B.D. with parameters 2.00 does not exist when $n$
and $t$ are odd and

(i) $n[(n - 1)t + 1]$ is not a perfect square, or

(ii) $n[(n - 1)t + 1]$ is a perfect square and $nt \equiv 1 \pmod{4}$ and the square-free
part of $n$ contains a prime $\equiv 3 \pmod{4}$.

THEOREM 2. An A.R.B.I.B.D. with parameters 2.00 does not exist when $n$ is
odd and $t$ is even and

(i) $(n - 1)t + 1$ is not a perfect square, or

(ii) $(n - 1)t + 1$ is a perfect square and $n + t \equiv 1 \pmod{4}$ and the square-free
part of $n$ contains a prime $\equiv 3 \pmod{4}$.

THEOREM 3. An A.R.B.I.B.D. with parameters 2.00 does not exist for any
value of $t$, if $n \equiv 2 \pmod{4}$ and the square-free part of $n$ contains a prime $\equiv 3$
(mod 4).
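The three theorems amount to a simple arithmetic test on n and t. The following sketch encodes the conditions as they are read above; `excluded` is an illustrative name, and a result of False means only that these theorems do not rule the design out, not that the design exists.

```python
from math import isqrt

def is_square(m):
    r = isqrt(m)
    return r * r == m

def squarefree_part_primes(n):
    """Primes dividing n to an odd power, i.e. the primes of its square-free part."""
    primes = []
    p = 2
    while p * p <= n:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e % 2:
            primes.append(p)
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def has_prime_3_mod_4(n):
    return any(p % 4 == 3 for p in squarefree_part_primes(n))

def excluded(n, t):
    """True when Theorem 1, 2 or 3 shows that no design with parameters 2.00 exists."""
    x = (n - 1) * t + 1
    if n % 2 == 1 and t % 2 == 1:                 # Theorem 1
        if not is_square(n * x):
            return True
        return (n * t) % 4 == 1 and has_prime_3_mod_4(n)
    if n % 2 == 1:                                # Theorem 2, t even
        if not is_square(x):
            return True
        return (n + t) % 4 == 1 and has_prime_3_mod_4(n)
    return n % 4 == 2 and has_prime_3_mod_4(n)    # Theorem 3
```

With t = 0 the test excludes n = 6, 14 and 21 but not n = 5 or 10, in line with the Bruck and Ryser condition mentioned in the Corollary.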


Proofs. Suppose the A.R.B.I.B.D. actually exists; then there are $r$ sets of
$n$ blocks each so that any two blocks of different sets have exactly $(n - 1)t + 1$
treatments in common. Since $b - v = n^2t + n < n^2t + n + 1 = r$, we can pick
out $b - v$ blocks, one from each of $b - v$ sets, so that from 2.07 and 2.08 the
matrix $C_{b-v}$ for these blocks is given by $(c_{ju})$ where

$$c_{jj} = n(nt + 1)[(n - 1)t + 1] \tag{3.00}$$

and

$$c_{ju} = -[(n - 1)t + 1] \tag{3.01}$$

for $j \ne u = 1, 2, \ldots, n^2t + n$. It is easily verified that

$$|C_{b-v}| = [(n - 1)t + 1]^{n^2t+n}\,(n^2t + n + 1)^{n^2t+n-1}. \tag{3.02}$$

Let the blocks of the design be permuted so that these are the first $n^2t + n$
blocks of the design. Taking $N_2$ as in 2.09 we get from 2.06 that

$$|N_2N_2'| = n^{\,n^3t-2n^2t+n^2-n}\,[(n - 1)t + 1]^{\,n^2[(n-1)t+1]}. \tag{3.03}$$

But $|N_2N_2'| = |N_2|^2$. Hence the right-hand side of 3.03 must be a perfect square.
Hence we get the following results.
A necessary condition for the existence of the design is that
(a) $n[(n - 1)t + 1]$ should be a perfect square if both $n$ and $t$ are odd, and
(b) $(n - 1)t + 1$ should be a perfect square if $n$ is odd and $t$ is even.


In the rest of the paper $p$ stands for an odd prime and will be suppressed in
the symbol $(a, b)_p$ whenever no confusion is likely to arise.










Using Theorems B and C it is easily verified that 
3.04 c,(NN’) = (— 1, n't + "+ 1)(—1,n)!"?(— 1, (wn — 1) + 178” 
‘(n, n't + + 1)°"((m — 1)t + 1, n't +41)’. 


From 2.12, 3.00, and 3.01 it is seen that for the matrix $C_{b-v}E_{b-v}$, the diagonal
elements are $(nt + 1)/(n^2t + n + 1)$ whereas the non-diagonal elements are
$-1/n(n^2t + n + 1)$. Obviously $C_{b-v}E_{b-v} \sim Q$, where $Q = (q_{ju})$ with

$$q_{jj} = n^2(n^2t + n + 1)(nt + 1)$$

and

$$q_{ju} = -n(n^2t + n + 1).$$

It is easily proved that:

$$|Q| = n^{\,n^2t+n}\,(n^2t + n + 1)^{\,2n^2t+2n-1}, \qquad c_p(C_{b-v}E_{b-v}) = c_p(Q), \tag{3.05}$$

and

$$(|NN'|, |C_{b-v}E_{b-v}|) = (|NN'|, |Q|),$$

where, from 2.05,

$$|NN'| = (n^2t + n + 1)\,[n((n - 1)t + 1)]^{\,n^2[(n-1)t+1]}. \tag{3.06}$$

Hence

$$c_p(N_2N_2') = c_p(NN')\,c_p(Q)\,(|NN'|, |Q|). \tag{3.07}$$


The value of c,(Q) can be calculated in exactly the same way as c,(NN’) 
and is given by 


3.08 C»(Q) = (- 1, gr ern, n't of n + ye. 


With these general results we now proceed to consider the various cases.
First take the case where both $n$ and $t$ are odd. If $n[(n - 1)t + 1]$ is not a perfect
square the design is impossible. Hence we consider only those values for which
$n[(n - 1)t + 1]$ is a perfect square. From 3.04 to 3.08,


cy(NN’) = (— 1, n't ++ 1)((m — 1)t+1, n't ++ 1), 
c,(Q) - (- 1, gprnawnntss,. a + n + 1), 
(\NN’|, |Q|) = (— 1, n°t + 2 + 1). 
Hence 
cp(N2N:’) = (— 1, gg PO order oat = (-—1, nie 


Hence $c_p(N_2N_2') = 1$ for those values of $n$ and $t$ for which $nt + 1 \equiv 0 \pmod{4}$.
If however $nt \equiv 1 \pmod{4}$ then

$$c_p(N_2N_2') = (-1, n)_p = (-1, p)_p = (-1/p),$$

if $p$ is a factor of the square-free part of $n$. Hence if $p \equiv 3 \pmod{4}$,

$$c_p(N_2N_2') = -1.$$

But $N_2N_2' \sim I_b$ and hence $c_p(N_2N_2') = c_p(I_b) = 1$, which is a contradiction.
Hence the design 2.00 is impossible. This proves Theorem 1.

Now consider the case when $n$ is odd and $t$ is even. We only consider those
values of $n$ and $t$ for which $(n - 1)t + 1$ is a perfect square. For these values
of $n$ and $t$, from 3.04 to 3.08 we have


¢y(NN’) = (— 1, n°t ++ 1)(—1,n), 
¢»(Q) — (- 1, 2jphrrer®. 
(\NN’|, |Q|) = (— 1, )(— 1, nt + 2 + 1). 
Hence 
c>(N2N2’) = (- Lepr - (— Lap. 


Obviously the right-hand side is always 1 except possibly when $n + t \equiv 1$
(mod 4), in which case

$$c_p(N_2N_2') = (-1, n).$$

Hence, as before, if the square-free part of $n$ contains a prime $\equiv 3 \pmod{4}$
the design is impossible. This proves Theorem 2.
Lastly, consider the case when n is even. Then 


cy(NN’) = (— 1, n't+n+1)(n,nt+n4+ 1), 
c(Q) = (— 1, 2)"(n, n't + n + 1), 
(INN’|, |Q|) = (— 1, nt + 0 + 1). 
Hence 
cy(N2N2’) = (— 1,2)". 


The value of the right-hand side is always 1 except possibly when $n \equiv 2 \pmod{4}$,
in which case

$$c_p(N_2N_2') = (-1, n).$$

Hence, as before, if the square-free part of $n$ contains a prime $\equiv 3 \pmod{4}$
then the design is impossible. This completes the proof of Theorem 3.

It is obvious that the above results are the best possible using this particular 
method. 


COROLLARY. Putting $t = 0$ in Theorems 2 and 3 above we get that the A.R.B.I.B.D.
with parameters

$$v = n^2, \quad b = n^2 + n, \quad r = n + 1, \quad k = n, \quad \lambda = 1$$

is impossible when $n \equiv 1$ or $2 \pmod{4}$ and the square-free part of $n$ contains a
prime $\equiv 3 \pmod{4}$. This is equivalent to the result given by Bruck and Ryser [3].


4. Improvement of an inequality for orthogonal arrays of strength 2. 
Consider a matrix A = (a,;) with m rows and N columns where each element 
a, represents one of the integers 0, 1, 2,..., — 1. Consider all the d-rowed 
submatrices that can be formed (d < m). Each column of any d-rowed submatrix 
gives an ordered d-plet. There are mn‘ possible d-plets. If in the N d-plets obtained 
from every submatrix each of the n‘ possible d-plets occurs exactly « times 
(N = un*), then the matrix is called an orthogonal array (N, m, n, d) of size 
N, m constraints, » levels, and strength d. The idea of orthogonal arrays which 
is very useful in certain combinatorial problems is due to Rao [8]. The multi- 
factorial designs considered by Plackett and Burman [7] are orthogonal arrays 
of strength 2. Give the values of n, d, and N (= yun*), let f(N, n, d) represent 
the maximum number of constraints possible. Then it is known [7] that 


$$f(\mu n^2, n, 2) \le I\!\left(\frac{\mu n^2 - 1}{n - 1}\right), \tag{4.1}$$

where $I(x)$ is the integral part of $x$. In some cases this inequality can be improved.
When $\mu - 1$ is not divisible by $n - 1$, Bose [2] has given the following
THEOREM D. If $\mu - 1 = a(n - 1) + b$, $0 < b < n - 1$, and $l$ is the largest
non-negative integer consistent with

$$n(b - 2l) > (b - l)(b - l + 1),$$

then

$$f(\mu n^2, n, 2) \le I\!\left(\frac{\mu n^2 - 1}{n - 1}\right) - l - 1.$$
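The definition of an orthogonal array and the two bounds above can be checked numerically. The following is a minimal sketch, not part of the original paper; the function names `is_orthogonal_array`, `plackett_burman_bound`, and `bose_bound` are hypothetical, and the array is assumed to be given as a list of $m$ rows of length $N$ with entries in $0, \ldots, n-1$.

```python
from collections import Counter
from itertools import combinations, product


def is_orthogonal_array(A, n, d):
    """True if every d-rowed submatrix of A shows each d-plet equally often."""
    N = len(A[0])
    if N % n**d != 0:
        return False
    mu = N // n**d  # required number of occurrences of each d-plet
    for rows in combinations(A, d):
        counts = Counter(zip(*rows))  # columns of the d-rowed submatrix
        if any(counts[t] != mu for t in product(range(n), repeat=d)):
            return False
    return True


def plackett_burman_bound(mu, n):
    """Right-hand side of inequality 4.1: I((mu*n^2 - 1)/(n - 1))."""
    return (mu * n * n - 1) // (n - 1)


def bose_bound(mu, n):
    """Bound of Theorem D; requires that n - 1 does not divide mu - 1."""
    b = (mu - 1) % (n - 1)
    if b == 0:
        raise ValueError("Theorem D assumes 0 < b < n - 1")
    l = 0
    # l = 0 always satisfies the inequality; increase l while it still holds.
    while n * (b - 2 * (l + 1)) > (b - (l + 1)) * (b - (l + 1) + 1):
        l += 1
    return plackett_burman_bound(mu, n) - l - 1


if __name__ == "__main__":
    A = [[0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0]]  # array (4, 3, 2, 2)
    print(is_orthogonal_array(A, 2, 2), plackett_burman_bound(1, 2))
    print(bose_bound(2, 3))  # mu = 2, n = 3
```

For $\mu = 2$, $n = 3$ one has $b = 1$ and $l = 0$, so the bound of Theorem D is $I(17/2) - 1 = 7$, one less than inequality 4.1 gives.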


The results of the previous section can be used to improve the inequality
4.1 in some cases when $\mu - 1$ is actually divisible by $n - 1$.

If $(\mu n^2 - 1)/(n - 1)$ is an integer (which implies that $(\mu n - 1)/(n - 1)$
is also an integer) and further if an orthogonal array exists with the maximum
possible number of constraints, which is $(\mu n^2 - 1)/(n - 1)$, then such an array
is said to be complete. It has been shown [7] that the existence of a complete
orthogonal array

$$\left(\mu n^2,\ \frac{\mu n^2 - 1}{n - 1},\ n,\ 2\right)$$

implies the existence of an A.R.B.I.B.D. with parameters

$$v = nk = \mu n^2,\qquad b = nr = n\left(\frac{\mu n^2 - 1}{n - 1}\right),\qquad \lambda = \frac{\mu n - 1}{n - 1}$$








and conversely. Hence, in particular, a complete orthogonal array
$(n^2[(n - 1)t + 1],\ n^2t + n + 1,\ n,\ 2)$ and an A.R.B.I.B.D. with parameters 2.00 are
co-existent. The theorems of the previous section can, therefore, be expressed in
terms of the non-existence of the corresponding complete orthogonal arrays.
Hence

$$f(n^2[(n - 1)t + 1],\ n,\ 2) \le n^2t + n,$$

for the values of $n$ and $t$ given in the theorems of the last section.


REFERENCES 


1. R. C. Bose, A note on the resolvability of balanced incomplete block designs, Sankhya, 6
(1942), 105-110.

2. R. C. Bose, Mathematics of factorial designs, Proceedings of the International Congress of
Mathematicians, 1 (1950), 543-548.

3. R.H. Bruck and H. J. Ryser, Non-existence of certain finite projective planes, Can. J. Math., 
1 (1949), 88-93. 

4. W.S. Connor, Jr., On the structure of balanced incomplete block designs, Annals of Mathemati- 
cal Statistics, 23 (1952), 57-71. 

5. B. W. Jones, The arithmetic theory of quadratic forms (New York, 1950). 

6. Gordon Pall, The arithmetic invariants of quadratic forms, Bull. Amer. Math. Soc., 51 
(1945), 185-197. 

7. R. L. Plackett and J. P. Burman, The design of optimum multifactorial experiments, Bio- 
metrika, 33 (1943-46), 305-325. 

8. C. R. Rao, Factorial experiments derivable from combinatorial arrangements of arrays, 
J. Royal Statistical Society Suppl., 9 (1947), 128-139. 

9. S. S. Shrikhande, Impossibility of some affine resolvable balanced incomplete block designs, 
Sankhya, 11 (1951), 185-186. 







University of Kansas 





CONCERNING DIFFERENCE SETS 
T. G. OSTROM 


A set of integers $\{a_0, a_1, \ldots, a_n\}$ is said to be a difference set modulo $N$
if the set of differences $\{a_i - a_j\}$ $(i, j = 0, 1, \ldots, n)$ contains each non-zero
residue mod $N$ exactly once. It follows that $N$ and $n$ are connected by the
relation $N = n^2 + n + 1$. If $\{a_0, a_1, \ldots, a_n\}$ is a difference set mod $N$, so is the
set $\{a_0 + s, a_1 + s, \ldots, a_n + s\}$ $(s = 0, 1, \ldots, N)$. These difference sets form
a finite projective plane of $N$ points, with each difference set constituting a
line in the plane. Conversely, given a finite projective plane of $N$ points and a
cyclic collineation of order $N$, the collineation leads to a numbering of the
points so that each line becomes a difference set. Singer [5] has shown that a
difference set can be constructed whenever $n$ is a prime power and has
conjectured that there are difference sets in no other cases. Hall [3] has shown that
there are no difference sets for any composite $n$ less than or equal to 100 and
Mann and Evans [2] have extended this result to $n$ less than or equal to 1600.
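The defining property can be tested directly on small cases. The sketch below is an illustration only; the helper name `is_difference_set` is hypothetical, and the sets $\{0, 1, 3\}$ mod 7 and $\{0, 1, 3, 9\}$ mod 13 are standard small examples for $n = 2$ and $n = 3$.

```python
from collections import Counter


def is_difference_set(elements, N):
    """True if the differences a_i - a_j (i != j) hit each non-zero residue mod N once."""
    residues = [a % N for a in elements]
    counts = Counter((a - b) % N for a in residues for b in residues if a != b)
    return all(counts[d] == 1 for d in range(1, N))


if __name__ == "__main__":
    print(is_difference_set([0, 1, 3], 7))      # n = 2, N = 7
    print(is_difference_set([0, 1, 3, 9], 13))  # n = 3, N = 13
    # Every translate of a difference set is again a difference set.
    print(all(is_difference_set([a + s for a in (0, 1, 3)], 7) for s in range(7)))
```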

Hall [3] has defined a "multiplier" as any number $q$ such that the set $\{qa_i\}$
$(i = 0, 1, \ldots, n)$ is the same as the set $\{a_j + s\}$ $(j = 0, 1, \ldots, n)$ for some $s$.
He has shown that every factor of $n$ is a multiplier. He has also shown that
for each $N$ which permits a difference set, there is at least one difference set
which is fixed by all multipliers. Mann [4] has shown that if there is a difference
set mod $N$ and a multiplier of even order, $n$ must be a square.
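Hall's definition of a multiplier translates into a short search over all translates. This is a brute-force sketch for small $N$ only; the names `is_multiplier` and `multipliers` are hypothetical and not taken from the papers cited.

```python
def is_multiplier(q, elements, N):
    """True if {q * a_i} coincides with some translate {a_j + s} mod N."""
    base = {a % N for a in elements}
    image = {(q * a) % N for a in base}
    return any({(a + s) % N for a in base} == image for s in range(N))


def multipliers(elements, N):
    """All multipliers q with 1 <= q < N, found by exhaustive search."""
    return [q for q in range(1, N) if is_multiplier(q, elements, N)]


if __name__ == "__main__":
    print(multipliers([0, 1, 3], 7))      # n = 2: the factor 2 of n appears
    print(multipliers([0, 1, 3, 9], 13))  # n = 3: the factor 3 of n appears
```

In both examples the multipliers form a cyclic group generated by a factor of $n$, and $-1$ is not among them.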

In this paper we show that, under certain conditions, the multipliers form a
cyclic group. We use this result to obtain extensions of some theorems of Mann
and Evans [2] concerning the possible orders of multipliers of a difference set.
These theorems have a definite bearing on the question as to which values of $n$
permit difference sets. Mann and Evans used their theorems, along with some
other results, to show that no difference sets can exist when $n$ is a composite
number less than 1600. On the basis of computations in individual cases, the
author conjectures that theorems of this type may eliminate all composite
values of $n$, thus leading to a complete solution of the problem.


DEFINITION. Let $N_i$ be a prime factor of $N$. (1) We shall say that $N_i$ is of
type I if there is some multiplier $q$ (mod $N$) such that the exponent to which $q$
belongs mod $N$ is greater than the exponent to which it belongs mod $N_i$. (2) We
shall say that $N_i$ is of type II if every multiplier mod $N$ belongs to the same
exponent mod $N$ as it does mod $N_i$.


Remark. No divisor of zero mod $N$ can be a multiplier since, if a difference
set $\{a_0, a_1, \ldots, a_n\}$ be multiplied by a divisor of zero, at least one of the
differences $\{a_i - a_j\}$ $(i \ne j)$ will be carried into zero. Hence every prime factor of $N$
is either of type I or type II.

Received August 12, 1952; in revised form October 15, 1952.


THEOREM 1. Suppose that there is a difference set mod $N$ and that $N$ has a
prime factor $N'$ of type I. Let $t$ be the exponent to which the multiplier $q$ (of the
definition above) belongs mod $N'$. Let $N_1 = (q^t - 1, N)$. Then (a) $N'$ divides
$N_1 \ne N$, (b) $N_1$ is of the form $n_1^2 + n_1 + 1$, (c) there is a difference set mod $N_1$
and every multiplier for the difference sets mod $N$ is a multiplier for the difference
sets mod $N_1$.


Proof. The proof follows immediately from Hall [3, Theorem 4.5] and the
definition of a factor of type I, since $q^t$ is a multiplier.


Remark. If $N_1$ in Theorem 1 is less than $1600^2 + 1600 + 1$, $n_1$ must be a
prime power. If, in addition, $n$ is divisible by 2 or 3, then $n_1$ must be a power of
2 or 3 respectively, since Hall has proved that 2 can be a multiplier only if $n$
is even, while Mann [4] has proved that 3 can be a multiplier only if $n$ is
congruent to zero mod 3.


COROLLARY 1. Suppose that $n = m^r$, where $(r, 3) = 1$ and there is a difference
set mod $N = n^2 + n + 1$. Then there is a difference set mod $N_1 = m^2 + m + 1$
and every multiplier mod $N$ is a multiplier mod $N_1$.


Proof. Let $(m - 1, N) = N'$. Then $m \equiv 1 \pmod{N'}$,

$$N = m^{2r} + m^r + 1 \equiv m^2 + m + 1 \equiv 3 \pmod{N'}.$$

Hence $N' = 1$ or 3 and, if $N' = 3$, $m^2 + m + 1 \equiv 0 \pmod 3$. In no case is
$n^2 + n + 1 \equiv 0 \pmod 9$. Hence $(m^3 - 1,\ m^{2r} + m^r + 1) = m^2 + m + 1$,
provided $(r, 3) = 1$.
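The greatest common divisor identity at the end of this proof is easy to spot-check. The sketch below is illustrative only; the function name `corollary1_gcd` is hypothetical.

```python
from math import gcd


def corollary1_gcd(m, r):
    """Return (m^3 - 1, m^(2r) + m^r + 1), the g.c.d. used in Corollary 1."""
    return gcd(m**3 - 1, m ** (2 * r) + m**r + 1)


if __name__ == "__main__":
    # With (r, 3) = 1 the g.c.d. equals m^2 + m + 1.
    for m, r in [(2, 2), (3, 2), (4, 2), (2, 5)]:
        print(m, r, corollary1_gcd(m, r), m * m + m + 1)
    # With r divisible by 3 the identity can fail: m = 2, r = 3 gives (7, 73) = 1.
    print(corollary1_gcd(2, 3))
```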


COROLLARY 2. The only difference sets for $n$ less than $1600^2$ in which there is a
multiplier of even order are those in which $n$ is an even power of a prime.


Proof. Mann has proved that if there is a multiplier of even order, $n$ must be
a square. If $n = m^2$, there is a difference set mod $(m^2 + m + 1)$. If $m < 1600$,
$m$ must be a prime power.


THEOREM 2. If $N$ contains a prime factor $N_i$ of type II, the multipliers form a
cyclic multiplicative group.


Proof. The product of two multipliers is a multiplier. Obviously the
multipliers form a group. Let the multipliers be reduced mod $N_i$. Since $N_i$ is prime,
the images of the multipliers in the residue system mod $N_i$ form a cyclic group.
We shall show that any two different multipliers $q_1$ and $q_2$ have different images
in the residue system mod $N_i$. Suppose that $q_1 \equiv q_2 \pmod{N_i}$, where $q_1$ and $q_2$
are multipliers. Since the multipliers form a group, we may write $q_2 \equiv q_1 q_3
\pmod N$, where $q_3$ is a multiplier. Thus $q_1 \equiv q_2 \equiv q_1 q_3 \pmod{N_i}$ and $q_1(q_3 - 1)
\equiv 0 \pmod{N_i}$. Now $q_1 \not\equiv 0 \pmod{N_i}$ since divisors of zero cannot be multipliers.
But $N_i$ is prime, so $q_3 - 1 \equiv 0 \pmod{N_i}$. Since $N_i$ is of type II, this implies
that $q_3 - 1 \equiv 0 \pmod N$ and hence $q_2 \equiv q_1 \pmod N$. Thus the mapping of
multipliers is 1 to 1, and the multipliers must form a cyclic group mod $N$.


THEOREM 3. Suppose that:
(1) there is a difference set mod $N = n^2 + n + 1$,
(2) $N = N_1 N_2 \cdots N_k$, where $N_i$ is prime $(i = 1, 2, \ldots, k)$,
(3) $n$ is not a square,
(4) for some $i$, $N_i$ is of type II;
then the order $S$ of the group of multipliers is odd and divides $\varphi(N_i) = N_i - 1$.


Proof. If $S$ is even, the order of the primitive multiplier is even and $n$ must
be a square. The order of any non-zero residue mod $N_i$ divides $\varphi(N_i)$. If $N_i$ is
of type II, the order of every multiplier mod $N_i$ is the same as its order mod $N$.


THEOREM 3.1. If the hypotheses of Theorem 3 are valid and
(5) $N_i$ is of type II for $i = 1, 2, \ldots, k$,
(6) $n + 1 \equiv 0 \pmod 3$,
then $S$ divides $n + 1$.


Proof. Mann and Evans have shown that 0 is not contained in the difference
set fixed under all multipliers if $n + 1 \equiv 0 \pmod 3$. Let $q$ be the primitive
multiplier. Then (for any number $a \not\equiv 0$) if $a$ is in the fixed difference set,
$a, aq, \ldots, aq^{S-1}$ are all incongruent mod $N$ and all included in the fixed
difference set. The $n + 1$ numbers in this fixed set therefore occur in subsets of
$S$ each.


THEOREM 3.2. If (1), (2), (3), (5) are all satisfied and $n \equiv 0 \pmod 3$ then $S$
divides $n$.


Proof. Mann and Evans have shown that 0 is contained in the fixed difference
set if $n \equiv 0 \pmod 3$. As in Theorem 3.1, the $n$ non-zero numbers in the fixed
difference set occur in subsets of $S$ each.


Remark. If $n - 1 \equiv 0 \pmod 3$, $N \equiv 0 \pmod 3$ and 3 is a factor of type I.


THEOREM 3.3. Suppose that:
(1) there is a difference set mod $N$,
(2) $N = N_1N_2$ ($N_1$ and $N_2$ not necessarily prime),
(3) every factor of $N_2$ is of type II with respect to $N$,
(4) $(q^a - 1, N) = N_1$, where $a < S$ and $q$ is the primitive multiplier;
then $N_1$ is of the form $n_1^2 + n_1 + 1$ and $S$ divides $n - n_1$.


Proof. By Theorem 1, there is a difference set mod $N_1$ and $N_1 = n_1^2 + n_1 + 1$.
By Mann and Evans [2, Theorem 6], there are $n_1 + 1$ multiples of $N_2$ in the
fixed difference set. If $a$ is any residue mod $N$ which is not a multiple of $N_2$,
let $t$ be the least power of the primitive multiplier $q$ such that $aq^t \equiv a \pmod N$.
Then $a(q^t - 1) \equiv 0 \pmod{N = N_1N_2}$. Hence $q^t - 1 \equiv 0$ mod some factor of
$N_2$. Since all factors of $N_2$ are of type II with respect to $N$, $t = S$. Thus the
$n - n_1$ residues in the fixed difference set which are non-multiples of $N_2$ occur
in sets $a, aq, \ldots, aq^{S-1}$ of $S$ each, and $S$ divides $n - n_1$.


Theorems 3.1, 3.2, and 3.3 are extensions of Corollaries 5.1, 5.2, and Theorem 
9, respectively, of Mann and Evans [2]. The proofs are similar in form to those 
given in [2]. 

As an example of the way in which Theorem 1 can be applied, suppose that $n$
is even and $n \equiv 5$ or $25 \pmod{31}$. Then 2 is a multiplier and $(2^5 - 1, N) = 31$.
But 2 is not a multiplier for the difference set mod 31; hence, by Theorem 1,
there can be no difference set mod $N$.

As an example of the application of the other theorems, consider the case
$n = 411$, $N = 313 \cdot 541$. Neither 313 nor 541 is of the form $n_1^2 + n_1 + 1$;
hence, by Theorem 1, neither can be of type I. By Theorem 3, $S$ divides $313 - 1
= 3 \cdot 8 \cdot 13$ and $541 - 1 = 4 \cdot 5 \cdot 27$. By Theorem 3.2, $S$ divides $n = 3 \cdot 137$.
Hence $S$ must be 3. Since 3 (which should be a multiplier) is not of order 3,
there can be no difference set.
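The arithmetic in these two examples can be reproduced in a few lines. This is only a numerical check of the quantities quoted above, under the assumption that "order" means the multiplicative order mod $N$; the helper `multiplicative_order` is a hypothetical name.

```python
from math import gcd


def multiplicative_order(a, N):
    """Smallest e >= 1 with a^e = 1 (mod N); assumes gcd(a, N) = 1."""
    order, x = 1, a % N
    while x != 1:
        x = (x * a) % N
        order += 1
    return order


if __name__ == "__main__":
    # First example: n = 5 or 25 (mod 31) forces 31 | N, and 2^5 - 1 = 31.
    for n in (36, 56):  # even n with n = 5 and n = 25 (mod 31)
        N = n * n + n + 1
        print(n, N % 31, gcd(2**5 - 1, N))

    # Second example: n = 411.
    n = 411
    N = n * n + n + 1
    print(N == 313 * 541)
    print(gcd(gcd(313 - 1, 541 - 1), n))  # S must divide this number
    print(multiplicative_order(3, N))     # order of 3 mod N, to compare with 3
```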


REFERENCES 


1. R. H. Bruck and H. J. Ryser, The nonexistence of certain finite projective planes, Can. J. 
Math., 1 (1949), 88-93. 

2. T. A. Evans and H. B. Mann, On simple difference sets, Sankhya, 11 (1951), 357-364. 

3. Marshall Hall, Cyclic projective planes, Duke Math. J., 14 (1947), 1079-1090. 

4. H. B. Mann, Some theorems on difference sets, Can. J. Math., 4 (1952), 222-226. 

5. James Singer, A theorem in finite projective geometry and some applications to number theory, 
Trans. Amer. Math. Soc., 43 (1938), 377-385. 


Montana State University 




ON RESIDUE DIFFERENCE SETS 
EMMA LEHMER 


1. Introduction. In recent years the subject of difference sets has attracted 
a considerable amount of attention in connection with problems in finite geo- 
metries [4]. Difference sets arising from higher power residues were first 
discussed by Chowla [1], who proved that biquadratic residues modulo p form 
a difference set if (p — 1)/4 is an odd square. In this paper we shall prove a 
similar result for octic residues and develop some necessary conditions which 
will eliminate all odd power residue difference sets and many others. We also 
prove that a perfect residue difference set (that is, one in which every difference 
appears exactly once) contains all the powers of 2 modulo p. 


DEFINITION. An $n$th power residue difference set of multiplicity $\lambda$ with respect
to a prime $p$ is the set

$$r_1, r_2, \ldots, r_k$$

of $n$th power residues of a prime $p = kn + 1$, which is such that if we form all
the $k(k - 1)$ non-zero differences

$$r_a - r_b \pmod p \qquad (a \ne b),$$

we will obtain every positive integer $\le p - 1$ exactly $\lambda$ times. Hence
$\lambda = (k - 1)/n$ and $p = \lambda n^2 + n + 1$.


If $\lambda = 1$, the set will be called a perfect residue difference set. In this case
$k = n + 1$ and $p = n^2 + n + 1$.
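The definition can be tested by brute force for small primes. The sketch below is illustrative and not part of the paper; the names `power_residues` and `residue_multiplicity` are hypothetical.

```python
from collections import Counter


def power_residues(p, n):
    """Sorted list of the distinct non-zero nth power residues modulo the prime p."""
    return sorted({pow(x, n, p) for x in range(1, p)})


def residue_multiplicity(p, n):
    """Return the multiplicity if the nth power residues mod p form a
    difference set, and None otherwise."""
    R = power_residues(p, n)
    counts = Counter((a - b) % p for a in R for b in R if a != b)
    values = {counts[d] for d in range(1, p)}
    return values.pop() if len(values) == 1 else None


if __name__ == "__main__":
    print(power_residues(73, 8))
    for p, n in [(7, 2), (11, 2), (37, 4), (73, 8), (13, 4)]:
        print(p, n, residue_multiplicity(p, n))
```

When a multiplicity is returned it agrees with $\lambda = (k - 1)/n$; for example $p = 37$, $n = 4$ has $k = 9$ and $\lambda = 2$.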

In order to study these sets efficiently we will need to use some properties of
the cyclotomic numbers $(i, j)$ introduced by Gauss and developed by Dickson
[2], together with an additional lemma about their parity, which is a
generalization of a lemma given by the author in an earlier paper [5].


2. Cyclotomic numbers. Let $p = nk + 1$ be a prime and let $g$ be a primitive
root of $p$. We shall say that a number $N$ belongs to the residue class $i$ with
respect to $g$ if $N \equiv g^{n\nu + i} \pmod p$. The cyclotomic constant $(i, j)$ denotes the
number of members of the residue class $i$ which are followed by a member of
the residue class $j$, or in other words, the number of solutions of the congruence

$$g^{n\nu + i} + 1 \equiv g^{n\mu + j} \pmod p,$$

where $i$ and $j$ are $\le n - 1$, while $\nu$ and $\mu$ are $\le k - 1$.


Received August 29, 1952. 













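We shall borrow the following properties of cyclotomic numbers (all of which
can be very readily derived) from Dickson [2, p. 394, (14), (15), (17)]:

$$(i, j) = (j, i), \qquad (n - i, j - i) = (i, j), \qquad k \text{ even} \tag{2.1}$$

$$(i, j) = (j + \tfrac12 n, i + \tfrac12 n), \qquad (n - i, j - i) = (i, j), \qquad k \text{ odd} \tag{2.2}$$

$$\sum_{j=0}^{n-1} (i, j) = k - \varepsilon_i \qquad (i = 0, 1, \ldots, n - 1), \tag{2.3}$$

where

$$\varepsilon_i = \begin{cases} 1 & \text{if } k \text{ is even and } i = 0, \text{ or if } k \text{ is odd and } i = \tfrac12 n, \\ 0 & \text{otherwise.} \end{cases}$$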


LEMMA I. The cyclotomic numbers $(0, j)$ are odd or even according as 2 belongs
to the residue class $j$ or not.


Proof. For every pair $r, r + 1$ such that $r$ is a residue and hence belongs to
residue class zero, while $r + 1$ belongs to the residue class $j$, there corresponds
a pair $\bar r, \bar r + 1$, where $r\bar r \equiv 1 \pmod p$, which is also such that $\bar r$ belongs to class
zero while $\bar r + 1 \equiv \bar r(r + 1)$ belongs to class $j$. Therefore the contribution to
the cyclotomic number $(0, j)$ is even unless $r = \bar r$. This implies that $r$ is either
1 or $p - 1$. The case $r = p - 1$ does not produce a solution since $r + 1 = 0$ is
not admissible, while the case $r = 1$, $r + 1 = 2$ gives an unpaired solution if
and only if 2 belongs to class $j$. Hence the lemma.
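The cyclotomic numbers of this section can be tabulated directly from their definition, which gives a quick check of the row sums 2.3 and of Lemma I. This is a brute-force sketch under the definitions above; the function names are hypothetical, and the primitive root is found by a simple search.

```python
def prime_factors(m):
    """Set of prime factors of m, by trial division."""
    factors, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors


def primitive_root(p):
    """Smallest primitive root of the odd prime p."""
    for g in range(2, p):
        if all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1)):
            return g


def cyclotomic_numbers(p, n):
    """Return (C, cls): C[i][j] is the cyclotomic number (i, j) and cls[x] is
    the residue class of x with respect to the primitive root chosen here."""
    g = primitive_root(p)
    cls, x = {}, 1
    for e in range(p - 1):
        cls[x] = e % n  # x = g^e belongs to class e mod n
        x = (x * g) % p
    C = [[0] * n for _ in range(n)]
    for x in range(1, p - 1):  # x + 1 must be non-zero mod p
        C[cls[x]][cls[x + 1]] += 1
    return C, cls


if __name__ == "__main__":
    C, cls = cyclotomic_numbers(73, 8)
    print([sum(row) for row in C])     # row sums, as in 2.3 with k = 9 odd
    print([c % 2 for c in C[0]], cls[2])  # parity of (0, j) and the class of 2
```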


3. Connection between residue difference sets and cyclotomic constants. 


THEOREM I. A necessary and sufficient condition that the class of $n$th power
residues form a difference set is that the cyclotomic numbers

$$(i, 0) = (k - 1)/n \qquad (i = 0, 1, \ldots, n - 1),$$

where $(k - 1)/n = \lambda$ is the multiplicity of the difference set.


Proof. First suppose that the residues form a difference set of multiplicity $\lambda$
so that for every positive integer $d$ there are $\lambda$ solutions of the congruence
$r_a - r_b \equiv d \pmod p$. Multiplying this congruence by $\bar r_b$, we have

$$d\bar r_b + 1 \equiv r_a \bar r_b \pmod p.$$

We note that the right-hand side belongs to the residue class zero, and that
$d\bar r_b$ belongs to the same class as $d$. Denoting this class by $i$ we have $(i, 0) = \lambda$.
But since $d$ was arbitrary this must hold for all $i$.

Conversely, if all the $(i, 0)$ are equal, then

$$(i, 0) = \sum_{i=0}^{n-1} (i, 0)/n.$$

But it follows readily from 2.3 with the help of either 2.1 or 2.2 that in all
cases

$$\sum_{i=0}^{n-1} (i, 0) = k - 1,$$

hence the common value of all the $(i, 0)$ is in fact $(k - 1)/n$. Moreover, the
correspondence set up in the first part of the proof is obviously one to one,
hence the residues form a difference set of multiplicity $\lambda = (k - 1)/n$ if all the
$(i, 0)$ are equal.


THEOREM II. There exists no residue difference set for $n$ odd; or for $n$ even and
$k$ even.


Proof. If $n$ is odd, then $k$ must be even, but for even $k$ we have by 2.1 the
equality $(0, j) = (j, 0)$. Hence Theorem I states in this case that the cyclotomic
constants $(0, j)$ are all equal. But by Lemma I one of these quantities is odd while
the others are even. Hence we have arrived at a contradiction and the theorem
follows.


THEOREM III. If $n$ is even and $k = (p - 1)/n$ is odd, then a necessary and
sufficient condition for the set of $n$th power residues modulo $p$ to form a difference set
is that

$$(i, 0) = (k - 1)/n \qquad (i = 0, 1, \ldots, \tfrac12 n - 1).$$


Proof. It follows readily from 2.2 that

$$(i + \tfrac12 n, 0) = (i, 0),$$

hence the theorem follows from Theorem I.


4. Multipliers. The notion of a multiplier was introduced by Hall [4] and is
as follows: A number $t$ is called a multiplier of a set $r_1, r_2, \ldots, r_k$ if the set
$tr_1, tr_2, \ldots, tr_k$ is congruent to the set $r_1 + s, r_2 + s, \ldots, r_k + s$ in some order
for some number $s$. The following theorem is true of multipliers of residue
difference sets.

THEOREM IV. The set of multipliers of a residue difference set is the set itself.


Proof. That every element of the set is a multiplier is obvious because it
leaves the set unaltered and $s = 0$. Suppose now that we have a multiplier $t$
which is not in the residue set and let $t$ belong to the residue class $\tau \ne 0$. Then
all the numbers $tr_1, tr_2, \ldots, tr_k$ will also belong to the residue class $\tau$. Hence in
this case $s \ne 0$. Let $s$ belong to the residue class $\sigma$. The congruence $r_a + s \equiv tr_b
\pmod p$ implies, by multiplying by $\bar s$, that $r_a\bar s + 1 \equiv t\bar s r_b \pmod p$. But the
number of solutions of the last congruence is $(n - \sigma, \tau - \sigma) = k$, and by 2.1 or 2.2
this implies $(\sigma, \tau) = k$. But by 2.3

$$\sum_{j=0}^{n-1} (\sigma, j) \le k,$$

hence all $(\sigma, j) = 0$ for $j \ne \tau$. But, for a difference set, $(\sigma, 0) = (k - 1)/n \ne 0$.
Hence we have arrived at a contradiction and the theorem follows.
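Theorem IV can be observed on the two perfect residue difference sets with $p = 7$ and $p = 73$. The following exhaustive search is a sketch for small primes only; the names `power_residues` and `residue_set_multipliers` are hypothetical.

```python
def power_residues(p, n):
    """Set of non-zero nth power residues modulo the prime p."""
    return {pow(x, n, p) for x in range(1, p)}


def residue_set_multipliers(p, n):
    """All t in 1..p-1 for which t*R is a translate R + s of the residue set R."""
    R = power_residues(p, n)
    translates = [{(r + s) % p for r in R} for s in range(p)]
    found = set()
    for t in range(1, p):
        image = {(t * r) % p for r in R}
        if image in translates:
            found.add(t)
    return found


if __name__ == "__main__":
    print(sorted(residue_set_multipliers(7, 2)))
    print(sorted(residue_set_multipliers(73, 8)))
```

In both cases the list printed is the residue set itself.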










5. Perfect residue difference sets. Hall [4] has proved that for $\lambda = 1$ every
divisor of $n$ is a multiplier of any difference set modulo $n^2 + n + 1$. He also
proved that 2 and 3, as well as 18 other pairs of numbers, cannot both be
multipliers. We now apply these results to residue difference sets.


THEOREM V. A perfect residue difference set contains all the powers of 2 modulo p. 


Proof. Since, by Theorem II, $n$ must be even, 2 divides $n$ and hence is a
multiplier of the difference set by Hall's theorem. But every multiplier is in the
set by Theorem IV, hence 2 is an $n$th power residue and hence all powers of 2
are $n$th power residues and are in the set.


COROLLARY. If the exponent of 2 (mod $p$) is exactly $n + 1$, then the set consists
of powers of 2 exclusively.


THEOREM VI. The only perfect residue difference sets with p < 2561600 are 
for n = 2, p =7 and for n = 8, p = 73. 


Proof. Evans and Mann [3] have recently proved that there exists no perfect
difference set for $n < 1600$, unless $n$ is a prime or a power of a prime. In our
case since $n^2 + n + 1$ must be a prime $p$, $n$ must be of the form $2^{3^a}$. The
only such $n < 1600$ which lead to prime values of $p$ are $n = 2$, 8, and 512.
The first two lead to well-known sets of quadratic residues 1, 2, 4 (mod 7), and
octic residues 1, 2, 4, 8, 16, 32, 37, 55, and 64 (mod 73), respectively.


The remaining case of $p = 262657$, $n = 512$, satisfies, as far as the writer has
been able to ascertain, all known necessary conditions for a difference set.
It has no multipliers other than powers of 2 less than 783, and can be generated
by $783^a$, $a = 0, 1, \ldots, 512$. An inspection of the set shows, however, that
two differences of powers of 783 are both congruent to 3:

$$3 \equiv 4 - 1 \equiv 89788 - 89785 \pmod{262657}.$$

Hence this is not a perfect difference set after all.
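The inspection described here can be repeated by machine. The sketch below builds the set of 512th power residues modulo 262657 directly from the definition, rather than from powers of 783, and lists the ways of writing 3 as a difference of two members; the function names are hypothetical. The text reports that there is more than one such representation.

```python
P, N_POWER = 262657, 512


def residue_set(p=P, n=N_POWER):
    """Set of non-zero nth power residues mod p (513 elements for these values)."""
    return {pow(x, n, p) for x in range(1, p)}


def representations(d, R, p=P):
    """All pairs (a, b) of members of R with a - b = d (mod p)."""
    return sorted((a, (a - d) % p) for a in R if (a - d) % p in R)


if __name__ == "__main__":
    R = residue_set()
    print(len(R))
    print(representations(3, R))
```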


6. Special values of n. For $n = 2$, Theorem III gives no further restriction
on $p$ beyond $\lambda = (0, 0) = (p - 3)/4$, which is satisfied. Hence there exists a
difference set of quadratic residues for all $p \equiv 3 \pmod 4$. This is a well-known
result. By Theorem II we need to consider only odd values of $k$.

For $n = 4$, the cyclotomic constants were given by Gauss in terms of the
quadratic partition $p = x^2 + 4y^2$, $x \equiv 1 \pmod 4$:

$$16(0, 0) = p + 2x - 7, \qquad 16(1, 0) = p - 2x - 3.$$

The condition $(0, 0) = (1, 0)$ implies $x = 1$ or $p = 1 + 4y^2$. Hence $k = (p - 1)/4
= y^2$. Since $k$ is odd, $k$ must be an odd square, which is Chowla's theorem.

For $n = 6$, the cyclotomic constants can be easily derived from Dickson's
results in terms of the quadratic partition $p = A^2 + 3B^2$, $A \equiv 1 \pmod 3$.

Case 1. 2 is a cubic residue. In this case the condition

$$36(0, 0) = p - 11 - 8A = 6(k - 1)$$

leads to $A = -\tfrac12$, which is impossible, since $A$ is an integer.

Case 2. 2 is a cubic non-residue. In this case

$$36(0, 0) = p - 11 - 2A = 6(k - 1)$$

leads to $A = -2$, while

$$36(1, 0) = p - 5 + 4A + 6B, \qquad 36(2, 0) = p - 5 - 2A - 6B$$

or

$$36(1, 0) = p - 5 - 2A + 6B, \qquad 36(2, 0) = p - 5 + 4A - 6B$$

(according as 2 belongs to class one or two), so that the condition $(1, 0) = (2, 0)$
implies $2B = -A = 2$, or $B = 1$ and $p = 7$. This is a trivial case since the only
sextic residue modulo 7 is $r = 1$, so that in this case $\lambda = 0$. In other words,
there exists no difference set of sextic residues.

For $n = 8$, a little more work is required to derive the needed cyclotomic
numbers from the groundwork laid by Dickson. These numbers are given in
terms of the quadratic partitions

$$p = a^2 + 2b^2 = x^2 + 4y^2 = 8k + 1, \qquad a \equiv x \equiv 1 \pmod 4.$$

Case I. 2 is a quartic residue. Then

$$64(0, 0) = p - 15 - 2x, \qquad 64(2, 0) = p - 2x - 8a - 7,$$

$$64(1, 0) = 64(3, 0) = p - 7 + 2x + 4a.$$

The condition

$$(0, 0) = (1, 0) = (2, 0)$$

implies $a = 1$, $x = -3$. Hence

$$(i, 0) = (p - 9)/64,$$

$$p = 1 + 2b^2 = 9 + 4y^2, \quad\text{or}\quad b^2 - 2y^2 = 4.$$

Letting $b = 2t$, $y = 4u$, we have the condition

$$t^2 - 8u^2 = 1,$$

where $k = (p - 1)/8 = t^2$ and $\lambda = (p - 9)/64 = u^2$. The first non-trivial
solution of this Pell equation is $t = 3$, $u = 1$, giving $p = 73$. The even-ordered
solutions of this Pell equation lead to values of $p$ which are multiples of 3. The
odd-ordered solutions give odd values of $u$ and lead to values of $p$ which satisfy
the recurring series

$$p_m = 1154p_{m-1} - p_{m-2} - 5760, \qquad p_0 = 73,\ p_1 = 73.$$








This gives

$$p_2 = 78409 = 89 \cdot 881,$$
$$p_3 = 90478153 = 4993 \cdot 18121,$$
$$p_4 = 104411704393 = \text{prime},$$
$$p_5 = 120491016385609 = 1721 \cdot 70012211729.$$

These factorizations were made on the SWAC. The prime $p_4$ is the modulus
for a difference set with $k = 114243^2$ elements of multiplicity $\lambda = 40391^2$.
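The values above can be regenerated in two independent ways, from the odd-ordered solutions of the Pell equation $t^2 - 8u^2 = 1$ and from the recurring series. This is a minimal sketch; the function names are hypothetical, and successive odd-ordered solutions are obtained by multiplying by $17 + 6\sqrt 8 = (3 + \sqrt 8)^2$.

```python
def pell_moduli(count):
    """(t, u, p) with p = 8t^2 + 1 for the first odd-ordered solutions of t^2 - 8u^2 = 1."""
    t, u = 3, 1
    out = []
    for _ in range(count):
        out.append((t, u, 8 * t * t + 1))
        # (t + u*sqrt(8)) * (17 + 6*sqrt(8))
        t, u = 17 * t + 48 * u, 6 * t + 17 * u
    return out


def recurrence_moduli(count):
    """p_1, ..., p_count from p_m = 1154 p_{m-1} - p_{m-2} - 5760, p_0 = p_1 = 73."""
    seq = [73, 73]
    while len(seq) < count + 1:
        seq.append(1154 * seq[-1] - seq[-2] - 5760)
    return seq[1:]


if __name__ == "__main__":
    for (t, u, p), q in zip(pell_moduli(5), recurrence_moduli(5)):
        print(t, u, p, p == q, (p - 1) // 8 == t * t, (p - 9) // 64 == u * u)
```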
Case II. 2 is a quartic non-residue. In this case

$$64(0, 0) = p - 15 - 10x - 8a, \qquad 64(2, 0) = p + 6x - 7$$

and the condition $(0, 0) = (2, 0) = (p - 9)/64$ leads to $x = -\tfrac13$, but this is
impossible. We can therefore summarize our results on octic residues in the
following theorem.


THEOREM VII. The set of octic residues modulo $p$ forms a difference set if and
only if the number of terms $k = (p - 1)/8$ and the multiplicity $\lambda = (p - 9)/64$
are both odd squares.


COROLLARY. An octic residue difference set contains all powers of 2 modulo $p$.


It is known that the condition for octic residuacity of 2 is that $\tfrac14 y$ be odd$^1$
for $p \equiv 9 \pmod{16}$. But in our case $u = \tfrac14 y$ is odd. Hence 2, and therefore all
its powers, are octic residues.

Finally, we discuss briefly the impossibility of residue sets for $n = 10$, when 2
is a quintic residue. The cyclotomic numbers for $n = 10$ are given in terms of
the solutions of

$$16p = x^2 + 50u^2 + 50v^2 + 125w^2, \qquad xw = v^2 - 4uv - u^2.$$

If 2 is a quintic residue, Dickson gives for $k$ odd

$$100(0, 0) = p - 19 + 8x$$

and the condition $(0, 0) = (p - 11)/100$ implies $x = 1$; but it has been proved
by the author [5], that if 2 is a quintic residue, $x$ must be even. Hence we have
arrived at a contradiction and there is no difference set in this case.

The cyclotomic numbers have not been worked out in sufficient detail to
complete the case in which 2 is a quintic non-residue. The same holds true for
larger values of $n$ such as 12 and 16, although a certain amount of work has been
done in these cases.








$^1$This can be made to follow readily from Lemma I and the expression for $(0, 0)$ in the octic
case. In fact for $p \equiv 9 \pmod{16}$ and $(0, 0)$ odd we have $64(0, 0) = p - 15 - 2x \equiv 64
\pmod{128}$. Hence $x \equiv 8v + 29 \pmod{64}$, which implies $y \equiv 4 \pmod 8$. Similarly if 2 is an
octic non-residue $(0, 0)$ is even and $y \equiv 0 \pmod 8$.














7. Modified residue difference sets. Hall points out that Theorem II holds
if zero is counted as a residue and that we can obtain further residue sets for
quartic residues. We will show that this can also be done for octic residue sets,
but that no other new cases arise.

If zero is counted as a residue, then the multiplicity $\lambda$ is given by $\lambda = (k + 1)/n$
and we have an analogue of Theorem III, namely


THEOREM III'. If $n$ is even and $k = (p - 1)/n$ is odd, then a necessary and
sufficient condition for the set of $n$th power residues and zero to be a difference set
is that

$$1 + (0, 0) = (i, 0) = (k + 1)/n, \qquad i = 1, 2, \ldots, \tfrac12 n - 1.$$

We now discuss the cases $n = 4$ and $n = 8$. For $n = 4$ the conditions of
Theorem III' give

$$p + 2x + 9 = p - 2x - 3 = p + 3.$$

This implies $x = -3$ so that $p = 9 + 4y^2$. Since $k = (p - 1)/4$ is odd, $y^2 = k - 2$
must also be odd and we have an analogue of Chowla's theorem:

The quartic residues and zero form a difference set modulo $p$ if and only if
$k - 2 = (p - 9)/4$ is an odd square.


For $n = 8$, if 2 is a quartic residue, we have by Theorem III'

$$p + 49 - 2x = p - 2x - 8a - 7 = p + 7.$$

This implies $x = 21$, $a = -7$. Hence

$$p = 49 + 2b^2 = 441 + 4y^2 \quad\text{or}\quad b^2 - 2y^2 = 196.$$

Letting, as before, $b = 2t$, $y = 4u$, we have

$$t^2 - 8u^2 = 49.$$

As before, this leads to a sequence of $p$'s which can be defined by the recurrence

$$p_m = 1154p_{m-1} - p_{m-2} - 282240,$$

where $p_0 = 697$, $p_1 = 26041$, and $m$ can take on positive and negative values.


The smallest prime value in this sequence is $p_1 = 26041$, and there are no
other primes less than $p_3 = 34352398777$. The prime $p = 26041$ gives a
difference set with $k = 3255$, $\lambda = 407$. This set contains 3256 elements including 0
and all 465 powers of 4 modulo 26041. It can be generated by powers of 7.
If 2 is a quartic non-residue we have

$$64(2, 0) = p + 6x - 7 = p + 7,$$

which is impossible. Hence we can state an analogue of Theorem VII, namely
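The modified sets can be checked without forming all differences. Because multiplication by an $n$th power residue maps the set (with or without zero) onto itself, the number of representations of a difference $d$ depends only on the residue class of $d$, so one representative per class suffices. The sketch below uses that shortcut; the function names are hypothetical.

```python
def prime_factors(m):
    """Set of prime factors of m, by trial division."""
    factors, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors


def primitive_root(p):
    """Smallest primitive root of the odd prime p."""
    for g in range(2, p):
        if all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1)):
            return g


def class_multiplicities(p, n, include_zero=False):
    """For one representative d = g^i of each residue class i, count the
    pairs (a, b) in the set with a - b = d (mod p)."""
    D = {pow(x, n, p) for x in range(1, p)}
    if include_zero:
        D.add(0)
    g = primitive_root(p)
    return [sum(1 for a in D if (a - pow(g, i, p)) % p in D) for i in range(n)]


if __name__ == "__main__":
    print(class_multiplicities(13, 4, include_zero=True))     # quartic residues and 0
    print(class_multiplicities(26041, 8, include_zero=True))  # octic residues and 0
```

The set is a difference set exactly when all the counts agree, and the common value is then the multiplicity $\lambda$.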





THEOREM VII'. The set of octic residues and zero forms a difference set modulo $p$
if and only if $k - 6 = (p - 49)/8$ is an odd square, while $\lambda - 7 = (p - 441)/64$
is an even square.


That $(p - 441)/64 = u^2$ is even follows from the fact that this time the odd
values of $u$ gave multiples of three for $p$, and are therefore eliminated. Since
$\lambda - 7$ is even, $\lambda$ is again odd. Since $u = \tfrac14 y$ is even, 2 is not an octic residue
(see footnote 1). Hence we can state:


COROLLARY. An octic residue difference set which includes zero contains all 
powers of 4 modulo p. 


It can be easily seen, as before, that the cases $n = 2$, 6, and 10 lead to
contradictions, so that we have been able to discover difference sets for only $n = 4$ and
8. It would be of interest to find out if the next possible case is $n = 12$ or $n = 16$.

In order to get a perfect residue difference set, when zero is counted as a
residue, we must have

$$p = n^2 - n + 1 = (n - 1)^2 + (n - 1) + 1.$$

For quartic residues $n = 4$, $p = 13$, the numbers 0, 1, 3, 9 form such a set,
but there is none for octic residues since $7^2 + 7 + 1 = 57$ is not a prime.


REFERENCES 

1. S. Chowla, A property of biquadratic residues, Proc. Nat. Acad. Sci. India, Sec. A, 14 (1944), 
45-46. 

2. L. E. Dickson, Cyclotomy, higher congruences and Waring's problem, Amer. J. Math., 57 
(1935), 391-424. 

3. T. A. Evans and H. B. Mann, On simple difference sets, Sankhya, 11 (1951), 357-364.

4. Marshall Hall, Jr., Cyclic projective planes, Duke Math. J., 14 (1947), 1079-1090.

5. Emma Lehmer, The quintic character of 2 and 3, Duke Math. J., 18 (1951), 11-18.


Pacific Palisades, California 








